IDENTIFIABILITY OF DIRECTED GAUSSIAN GRAPHICAL
MODELS WITH ONE LATENT SOURCE 

DENNIS LEUNG, MATHIAS DRTON, AND HISAYUKI HARA 


Abstract. We study parameter identifiability of directed Gaussian graphical
models with one latent variable. In the scenario we consider, the latent
variable is a confounder that forms a source node of the graph and is a parent to all
other nodes, which correspond to the observed variables. We give a graphical
condition that is sufficient for the Jacobian matrix of the parametrization map
to be full rank, which entails that the parametrization is generically
finite-to-one, a fact that is sometimes also referred to as local identifiability. We also
derive a graphical condition that is necessary for such identifiability. Finally,
we give a condition under which generic parameter identifiability can be
determined from identifiability of a model associated with a subgraph. The power
of these criteria is assessed via an exhaustive algebraic computational study
on models with 4, 5, and 6 observable variables.


1. Introduction 


In this paper we study parameter identifiability in directed Gaussian graphical
models with a latent variable. Our work falls in a line of work where the
graphical representation of causally interpretable latent variable models is used to give
tractable criteria to decide whether parameters can be uniquely recovered from
the joint distribution of the observed variables (Pearl, 2009). Some examples of
prior work in this context are Chen et al. (2014), Drton et al. (2011),
Foygel et al. (2012), Grzebyk et al. (2004), Kuroki and Miyakawa (2004),
Kuroki and Pearl (2014), Stanghellini and Wermuth (2005), Tian (2005), and
Tian (2009).

The setup we consider has a single latent variable appear as a source node in the
directed graph defining the Gaussian model. The resulting models can be described
as follows. Let $X_1, \dots, X_m$ be observable variables, and let $L$ be a hidden variable,
and suppose the variables are related by linear equations as

$X_v = \sum_{w \neq v} \lambda_{wv} X_w + \delta_v L + \epsilon_v, \qquad v = 1, \dots, m,$

where $\lambda_{wv}$, $\delta_v$ are real coefficients quantifying linear relationships, and the $\epsilon_v$ are
independent mean zero Gaussian noise terms with variances $\omega_v > 0$. The latent
variable $L$ is assumed to be standard normal and independent of the noise terms
$\epsilon_v$. Letting $X = (X_1, \dots, X_m)^T$, $\epsilon = (\epsilon_1, \dots, \epsilon_m)^T$ and
$\delta = (\delta_1, \dots, \delta_m)^T$, we may present the model in the vectorized form

(1.1)  $X = \Lambda^T X + \delta L + \epsilon,$

where $\Lambda$ is the matrix $(\lambda_{wv})$ with $\lambda_{vv} = 0$ for all $v = 1, \dots, m$. We are then
interested in specific models, in which for certain pairs of nodes $w \neq v$ the coefficient
$\lambda_{wv}$ is constrained to zero. In particular, we are interested in recursive models, that
is, models in which the matrix $\Lambda$ can be brought into strictly upper triangular form
by permuting the indices of the variables (and thus the rows and columns of $\Lambda$).
This implies that $I_m - \Lambda$ is invertible, where $I_m$ is the $m \times m$ identity matrix. It
follows that the observable variate vector $X$ has an $m$-variate normal distribution
$N_m(0, \Sigma)$ with covariance matrix

(1.2)  $\Sigma = (I_m - \Lambda^T)^{-1} (\Omega + \delta\delta^T) (I_m - \Lambda)^{-1},$

where $\Omega$ is the diagonal matrix with $\Omega_{vv} = \omega_v$. For additional background on
graphical models we refer the reader to Lauritzen (1996) and Pearl (2009). We note
that the models we consider also belong to the class of linear structural equation
models (Bollen, 1989).

A Gaussian latent variable model postulating recursive zero structure in the
matrix $\Lambda$ from (1.1) can be thought of as associated with a graph $G = (V, E)$ whose
vertex set $V = \{1, \dots, m\}$ is the index set for the observable variables $X_1, \dots, X_m$.
For two distinct nodes $w, v \in V$, the edge set $E$ includes the directed edge $(w, v)$,
denoted as $w \to v$, if and only if the model includes $\lambda_{wv}$ as a free parameter.
When the model is recursive, the directed graph $G$ is acyclic and, following common
terminology, we refer to $G$ as a DAG (for directed acyclic graph). In this paper, we
will always assume that the nodes are labeled in topological order, that is, we
have $V = \{1, \dots, m\}$ and $w \to v \in E$ only if $w < v$.

To emphasize the presence of the latent variable $L$, one could equivalently
represent the model by an extended DAG $\bar{G} = (\bar{V}, \bar{E})$ on $m + 1$ nodes enumerated
as $\bar{V} := \{0, 1, \dots, m\}$, where the node 0 corresponds to the latent variable $L$, and
if $G = (V, E)$ is the graph on $m$ nodes representing the model in the preceding
paragraph, then $\bar{E} = E \cup \{0 \to v : v \in \{1, \dots, m\}\}$. The edges $0 \to v$ correspond
to the coefficients $\delta_v$.

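For the DAG $G = (V, E)$, let

$\mathbb{R}^E := \{\Lambda = (\lambda_{wv}) \in \mathbb{R}^{m \times m} : w \to v \notin E \Rightarrow \lambda_{wv} = 0\}$

be the linear space of coefficient matrices, and let $\mathrm{diag}^+_m$ be the set of all $m \times m$
diagonal matrices with a positive diagonal.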


Definition 1.1. The Gaussian one latent source model associated with a given
DAG $G = (V, E)$, denoted as $\mathcal{N}_*(G)$, is the family of all $m$-variate normal
distributions $N_m(0, \Sigma)$ with a covariance matrix of the form

$\Sigma = (I_m - \Lambda^T)^{-1} (\Omega + \delta\delta^T) (I_m - \Lambda)^{-1},$

for $\Lambda \in \mathbb{R}^E$, $\Omega \in \mathrm{diag}^+_m$ and $\delta \in \mathbb{R}^m$.

The model $\mathcal{N}_*(G)$ has the parametrization map

(1.3)  $\phi_G : (\Lambda, \Omega, \delta) \mapsto (I_m - \Lambda^T)^{-1} (\Omega + \delta\delta^T) (I_m - \Lambda)^{-1}$

defined on the set $\Theta := \mathbb{R}^E \times \mathrm{diag}^+_m \times \mathbb{R}^m$, which we may also view as an open
subset of $\mathbb{R}^{2m + |E|}$, where $|E|$ is the cardinality of the directed edge set $E$. Clearly,
the image of $\phi_G$ is in $PD_m$, the cone of positive definite $m \times m$ matrices. Note that
since $G$ is acyclic, we have $(I_m - \Lambda)^{-1} = I_m + \Lambda + \Lambda^2 + \dots + \Lambda^{m-1}$ and thus the
covariance parametrization $\phi_G$ is a polynomial map.
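
The finite geometric series for $(I_m - \Lambda)^{-1}$ can be checked on a small example. The following is a minimal sketch in exact rational arithmetic and is not part of the original development; the helper names (`inverse_of_i_minus`, `phi`), the 0-based indexing, and the parameter values for the chain 1 → 2 → 3 are illustrative assumptions.

```python
from fractions import Fraction


def identity(m):
    return [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse_of_i_minus(lam):
    """Return I + lam + ... + lam^(m-1), which equals (I - lam)^{-1}
    whenever lam is strictly upper triangular (hence nilpotent)."""
    m = len(lam)
    total, power = identity(m), identity(m)
    for _ in range(m - 1):
        power = matmul(power, lam)
        total = [[total[i][j] + power[i][j] for j in range(m)] for i in range(m)]
    return total


def phi(lam, omega, delta):
    """Evaluate (I - lam^T)^{-1} (diag(omega) + delta delta^T) (I - lam)^{-1}."""
    m = len(lam)
    inv = inverse_of_i_minus(lam)
    middle = [[(omega[i] if i == j else 0) + delta[i] * delta[j]
               for j in range(m)] for i in range(m)]
    return matmul(matmul(transpose(inv), middle), inv)


# Hypothetical example: the chain 1 -> 2 -> 3 with arbitrary parameter values.
LAM = [[0, 2, 0], [0, 0, 3], [0, 0, 0]]
OMEGA = [1, 2, 3]
DELTA = [1, -1, 2]
SIGMA = phi(LAM, OMEGA, DELTA)
```

Because only sums and products of the parameters occur, every entry of `SIGMA` is a polynomial in the entries of `LAM`, `OMEGA` and `DELTA`, in line with the remark that $\phi_G$ is a polynomial map.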

In this paper we will derive graphical conditions on $G$ that are sufficient/necessary
for identifiability of the model $\mathcal{N}_*(G)$. We begin by clarifying what precisely we
mean by identifiability. The most stringent notion, namely that of global
identifiability, requires $\phi_G$ to be injective on all of $\Theta$. While this notion is important
(Drton et al., 2011), it is too stringent for the setting we consider here. Indeed, for
any triple $(\Lambda, \Omega, \delta) \in \Theta$, $\phi_G(\Lambda, \Omega, \delta) = \phi_G(\Lambda, \Omega, -\delta)$, which implies that the fiber

$\{(\Lambda', \Omega', \delta') \in \Theta : \phi_G(\Lambda, \Omega, \delta) = \phi_G(\Lambda', \Omega', \delta')\}$

always has cardinality $\geq 2$. We may account for this symmetry by requiring $\phi_G$ to
be 2-to-1 on all of $\Theta$, but this is not enough as there are always some fibers that
are infinite. For instance, it is easy to show that the fiber in the above display is
infinite when $\delta = 0$. As such, it is natural to consider notions of generic
identifiability. Specifically, our contributions will pertain to the notion of generic finite
identifiability, as defined below, that only requires finite identification of parameters
away from a fixed null set in $\Theta$; here a null set is a set of Lebesgue measure zero.
This notion is also referred to as local identifiability in other related work such as
Anderson and Rubin (1956).


Null sets appearing in our work are algebraic sets, where an algebraic set $A \subset \mathbb{R}^n$
is the set of common zeros of a collection of multivariate polynomials, i.e.,

$A = \{a \in \mathbb{R}^n : f_i(a) = 0, \ i = 1, \dots, k\},$

for $f_i \in \mathbb{R}[x_1, \dots, x_n]$, where $\mathbb{R}[x_1, \dots, x_n]$ is the ring of polynomials in $n$ variables
with coefficients in $\mathbb{R}$. Note that $A$ is a closed set in the usual Euclidean topology. If
all polynomials $f_i$ are the zero polynomial then $A = \mathbb{R}^n$. Otherwise, $A$ is a proper
subset, $A \subsetneq \mathbb{R}^n$, and its dimension is then less than $n$. In particular, a proper
algebraic subset of $\mathbb{R}^n$ has measure zero.


Definition 1.2. Let $S$ be an open subset of $\mathbb{R}^n$, and let $f$ be a map defined on $S$.
Then $f$ is said to be generically finite-to-one if there exists a proper algebraic set
$\tilde{S} \subset \mathbb{R}^n$ such that the fiber of $s$, i.e. the set $\{s' \in S : f(s') = f(s)\}$, is finite for all
$s \in S \setminus \tilde{S}$. Otherwise, $f$ is said to be generically infinite-to-one.

Definition 1.3. The model $\mathcal{N}_*(G)$ of a given DAG $G = (V, E)$ is said to be
generically finitely identifiable if its parametrization $\phi_G$ defined on $\Theta$ is generically
finite-to-one. We also say the DAG $G$ is generically finitely identifiable for short.

Hereafter, for any map $f$ defined on an open domain $S \subset \mathbb{R}^n$, we will use

(1.4)  $\mathcal{F}_f(s) := \{s' \in S : f(s') = f(s)\}$

to denote the fiber of a point $s \in S$. If $T$ is a subset of $S$, we will use $f|_T$ to denote
the restriction of $f$ to $T$, in which case for any $t \in T$, we have the fiber

$\mathcal{F}_{f|_T}(t) = \{t' \in T : f(t') = f(t)\}.$

The term "generic point" will refer to any point in the domain $S$ that lies outside
a fixed proper algebraic subset $\tilde{S}$, and a property is said to hold generically if it
holds everywhere on $S \setminus \tilde{S}$. The following well-known lemma is a main tool in this
paper, and its proof will be included in Appendix A for completeness. It gives as
an immediate corollary a trivial necessary condition for generic finite identifiability.

Lemma 1.1. Suppose $f$ is a polynomial map defined on an open set
$S \subset \mathbb{R}^n$. The following statements are equivalent:

(i) $f$ is generically finite-to-one.
(ii) There exists a proper algebraic subset $\tilde{S} \subset \mathbb{R}^n$ such that the fibers of the
restricted map $f|_{S \setminus \tilde{S}}$ are all finite, i.e. $|\mathcal{F}_{f|_{S \setminus \tilde{S}}}(s)| < \infty$ for all $s \in S \setminus \tilde{S}$.
(iii) The Jacobian matrix of $f$ is generically of full column rank.

Figure 1.1. A DAG $G$ that satisfies the sufficient condition in
Theorem 1.3; its undirected complement $G^c$ is shown on the right.

Corollary 1.2. Given a DAG $G = (V, E)$, a necessary condition for generic finite
identifiability of its associated model $\mathcal{N}_*(G)$ is that $\binom{m+1}{2} - 2m \geq |E|$.

Proof of Corollary 1.2. The Jacobian matrix of $\phi_G$ is of size $\binom{m+1}{2} \times (|E| + 2m)$,
and it is necessary that $\binom{m+1}{2} \geq |E| + 2m$ for it to have full column rank. □


Property (ii) is seemingly weaker than (i) in Lemma 1.1. It is useful in proving
our results in Section 5. In light of Corollary 1.2, for the rest of this paper we will
restrict our attention to DAGs $G = (V, E)$ with $\binom{m+1}{2} - 2m \geq |E|$, in which case
$m$ must be at least 3.
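
The counting bound of Corollary 1.2 is easy to tabulate. The sketch below is only an illustration; the function names `max_edges` and `passes_dimension_count` are hypothetical and do not appear in the paper.

```python
from math import comb


def max_edges(m):
    """Largest |E| allowed by Corollary 1.2: C(m+1, 2) - 2m."""
    return comb(m + 1, 2) - 2 * m


def passes_dimension_count(m, num_edges):
    """True if the Jacobian has at least as many rows as columns."""
    return max_edges(m) >= num_edges


# The bound is negative for m = 2 and first becomes non-negative at m = 3.
BOUNDS = {m: max_edges(m) for m in range(2, 7)}
```

For the sizes studied later, the bound allows at most 2, 5 and 9 edges when m = 4, 5 and 6, respectively.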

One of our contributions is a sufficient graphical condition stated in Theorem 1.3
below. For $v \neq w \in V$, we will use $v - w$ or $w - v$ to denote the edge $(v, w) =
(w, v)$ of an undirected graph on $V$. With slight abuse of notation, we may also use
$v - w$ or $w - v$ to denote an edge $v \to w \in E$ when the directionality of edges in
a DAG $G = (V, E)$ is to be ignored. For any directed/undirected graph $G = (V, E)$,
the complement of $G$, denoted as $G^c = (V, E^c)$, is the undirected graph on $V$ with
the edge set $E^c = \{v - w : (v, w) \notin E \text{ and } (w, v) \notin E\}$.


Theorem 1.3 (Sufficient condition for generic finite identifiability). The model
$\mathcal{N}_*(G)$ given by a DAG $G = (V, E)$ is generically finitely identifiable if every
connected component of $G^c$ contains an odd cycle.


Figure 1.1 shows a DAG $G$ that satisfies the sufficient condition in Theorem 1.3;
its undirected complement $G^c$ is shown on the right of the figure. We will revisit
this example in Section 4, where we report on algebraic computations that show
that for this graph $G$ the fibers of $\phi_G$ are generically of size 2 or 4.
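
The condition of Theorem 1.3 can be checked mechanically. The following sketch relies on the standard fact that a connected graph contains an odd cycle exactly when it is not bipartite, and tests bipartiteness by breadth-first 2-coloring. The function names and the example graphs are assumptions made for illustration; they are not the graph of Figure 1.1.

```python
from collections import deque
from itertools import combinations


def complement_edges(m, directed_edges):
    """Undirected complement of a DAG on nodes 1..m (edge direction ignored)."""
    present = {frozenset(e) for e in directed_edges}
    return [(v, w) for v, w in combinations(range(1, m + 1), 2)
            if frozenset((v, w)) not in present]


def components_without_odd_cycle(nodes, undirected_edges):
    """Count connected components that are bipartite, i.e. have no odd cycle.
    Isolated nodes count as such components."""
    adj = {v: set() for v in nodes}
    for v, w in undirected_edges:
        adj[v].add(w)
        adj[w].add(v)
    color, count = {}, 0
    for start in nodes:
        if start in color:
            continue
        color[start] = 0
        queue, bipartite = deque([start]), True
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in color:
                    color[w] = 1 - color[v]
                    queue.append(w)
                elif color[w] == color[v]:
                    bipartite = False  # an odd cycle closes here
        if bipartite:
            count += 1
    return count


def satisfies_theorem_1_3(m, directed_edges):
    """True if every connected component of G^c contains an odd cycle."""
    nodes = list(range(1, m + 1))
    return components_without_odd_cycle(nodes, complement_edges(m, directed_edges)) == 0
```

For example, the empty DAG on four nodes has the complete graph as its complement and passes the check, whereas the chain 1 → 2 → 3 → 4 has a path as its complement and does not. Since the condition is only sufficient, a failed check by itself says nothing about identifiability.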

Our approach to proving Theorem 1.3 also yields a necessary condition for generic
finite identifiability. This condition can be stated in terms of two undirected graphs
on the node set $V$, denoted $G_{|L,cov} = (V, E_{|L,cov})$ and $G_{con} = (V, E_{con})$, where
$E_{|L,cov}$ captures the dependency of variable pairs after conditioning on the latent
variable $L$, and $E_{con}$ captures the dependency of variable pairs after conditioning on
all other variables. From (1.1) it can be seen that $\Sigma_{|L} := (I_m - \Lambda^T)^{-1} \Omega (I_m - \Lambda)^{-1}$
is the covariance matrix of $X$ conditioning on $L$, hence $v - w \in E_{|L,cov}$ if and only
if $(\Sigma_{|L})_{vw} \not\equiv 0$, and analogously $v - w \in E_{con}$ if and only if $(\Sigma_{|L}^{-1})_{vw} \not\equiv 0$. It is well
known that these two undirected graphs can be obtained by using the d-separation
criterion applied to the extended DAG $\bar{G}$; see Drton et al. (2009, p. 73) for example.

Figure 1.2. A graph $G$ (left), $G_{con}$ (middle) and $G^c_{con}$ (right).
Since $|E_{con}| - |E| = 1 < 2 = d_{con}$, the necessary condition in
Thm. 1.4 does not hold.


Theorem 1.4 (Necessary condition for generic finite identifiability). Given a DAG
$G = (V, E)$, for the model $\mathcal{N}_*(G)$ to be generically finitely identifiable, it is necessary
that the following two conditions both hold:

(i) $|E_{con}| - |E| \geq d_{con}$, where $d_{con}$ is the number of connected components in
the graph $(G_{con})^c$ that do not contain any odd cycle;

(ii) $|E_{|L,cov}| - |E| \geq d_{cov}$, where $d_{cov}$ is the number of connected components
in the graph $(G_{|L,cov})^c$ that do not contain any odd cycle.


Figure 1.2 gives an example of a DAG that fails to satisfy our necessary condition,
specifically, condition (i).
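
Both conditions of Theorem 1.4 can be evaluated from the edge list alone. The sketch below rests on two standard d-separation facts that are assumptions here rather than statements from the text: $v - w \in E_{|L,cov}$ exactly when $v$ and $w$ share an ancestor in $G$ (counting each node as its own ancestor), and $v - w \in E_{con}$ exactly when $v$ and $w$ are adjacent in $G$ or have a common child. Nodes are assumed to be labeled 1, ..., m in topological order; all function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bipartite_components(nodes, edges):
    """Number of connected components without an odd cycle."""
    adj = {v: set() for v in nodes}
    for v, w in edges:
        adj[v].add(w)
        adj[w].add(v)
    color, count = {}, 0
    for start in nodes:
        if start in color:
            continue
        color[start], queue, bipartite = 0, deque([start]), True
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in color:
                    color[w] = 1 - color[v]
                    queue.append(w)
                elif color[w] == color[v]:
                    bipartite = False
        count += bipartite
    return count


def conditional_graphs(m, edges):
    """Return (E_{|L,cov}, E_con) as sets of pairs (v, w) with v < w."""
    nodes = range(1, m + 1)
    pairs = list(combinations(nodes, 2))
    anc = {}
    for v in nodes:  # topological order: parents carry smaller labels
        anc[v] = {v}
        for w, x in edges:
            if x == v:
                anc[v] |= anc[w]
    children = {v: {x for w, x in edges if w == v} for v in nodes}
    skeleton = {tuple(sorted(e)) for e in edges}
    e_cov = {p for p in pairs if anc[p[0]] & anc[p[1]]}
    e_con = {p for p in pairs if p in skeleton or children[p[0]] & children[p[1]]}
    return e_cov, e_con


def theorem_1_4_conditions(m, edges):
    """Return the truth values of conditions (i) and (ii), in that order."""
    nodes = list(range(1, m + 1))
    pairs = set(combinations(nodes, 2))
    e_cov, e_con = conditional_graphs(m, edges)
    d_con = bipartite_components(nodes, pairs - e_con)
    d_cov = bipartite_components(nodes, pairs - e_cov)
    return (len(e_con) - len(edges) >= d_con, len(e_cov) - len(edges) >= d_cov)
```

As an illustration that is not taken from the paper, the DAG on five nodes with edges 1 → 5, 2 → 5, 3 → 5, 4 → 5 meets the counting bound of Corollary 1.2 and condition (i), but violates condition (ii).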

In addition to the closely related work of Stanghellini (1997) and Vicard (2000),
identifiability of directed Gaussian models with one latent variable has been studied
by Stanghellini and Wermuth (2005). The models we treat here are special cases
with the latent node being a common parent of all the observable nodes. As we
review in more detail in Section 2, we can readily adapt the sufficient graphical
criteria given in Stanghellini and Wermuth (2005) for certifying that the model $\mathcal{N}_*(G)$
of a given DAG $G$ is generically finitely identifiable with respect to Definition 1.3.
Our own sufficient condition stated in Theorem 1.3 is stronger, in the sense that
every DAG $G$ satisfying the sufficient conditions in Stanghellini and Wermuth (2005)
necessarily satisfies the condition in Theorem 1.3. However, when it applies, the
result of Stanghellini and Wermuth (2005) yields a stronger conclusion than our
generic finiteness result. Indeed, as we also emphasize in the discussion in Section 6,
their conditions imply that the parametrization is generically 2-to-1.

We will prove the above stated Theorems 1.3 and 1.4 in Section 3. Since the
parametrization map in (1.3) is polynomial, the generic finite identifiability of a
given model is decidable by algebraic techniques that involve Gröbner basis
computations. In Section 4 we will study the applicability of our graphical criteria
via such algebraic computations for all models $\mathcal{N}_*(G)$ of DAGs $G$ with $m = 4, 5, 6$
nodes. Section 5 will give results on situations where we can determine generic
finite identifiability of a model $\mathcal{N}_*(G)$ based on knowledge about the generic finite
identifiability of a model $\mathcal{N}_*(G')$, where $G'$ is an induced subgraph of $G$.

Before ending this introduction, however, we comment on the role that Markov
equivalence plays in our problem. Recall that two DAGs defined on the same set
of nodes are Markov equivalent if they have the same d-separation relations. The
following theorem, which will be proved in Appendix A, says that generic finite
identifiability is a property of Markov equivalence classes of DAGs.

Theorem 1.5. Suppose $G_1 = (V, E_1)$ and $G_2 = (V, E_2)$ are two Markov equivalent
DAGs on the same set of nodes $V$. Then the model $\mathcal{N}_*(G_1)$ is generically finitely
identifiable if and only if the same is true for $\mathcal{N}_*(G_2)$.


2. Prior work 


Stanghellini and Wermuth (2005) give sufficient graphical conditions for
identifiability of directed Gaussian graphical models with one latent variable that can
be any node in the DAG. We revisit their result in the context of the models from
Definition 1.1 and formulate it in terms of generic finite identifiability. (As was
mentioned in the Introduction, their result yields in fact the stronger conclusion of
a generically 2-to-1 parametrization.) We begin by stating a well-known fact about
DAG models without latent variables.

Lemma 2.1. For any DAG $G = (V, E)$ with $m = |V|$ nodes, the map

$(\Lambda, \Omega) \mapsto (I_m - \Lambda^T)^{-1} \Omega (I_m - \Lambda)^{-1}$

is injective on the domain $\mathbb{R}^E \times \mathrm{diag}^+_m$ and has a rational inverse.

Proof. For any $(\Lambda, \Omega) \in \mathbb{R}^E \times \mathrm{diag}^+_m$, let $\Sigma = (\sigma_{vw}) = (I_m - \Lambda^T)^{-1} \Omega (I_m - \Lambda)^{-1}$.
Let $pa(v) = \{w : w \to v \in E\}$ be the parent set of the node $v$. Then one can show,
by induction on $m$ and considering a topological ordering of $V$, that

$\Lambda_{pa(v),v} = (\Sigma_{pa(v),pa(v)})^{-1} \Sigma_{pa(v),v}$

and

$\Omega_{vv} = \sigma_{vv} - \Sigma_{v,pa(v)} (\Sigma_{pa(v),pa(v)})^{-1} \Sigma_{pa(v),v};$

compare, for instance, Richardson and Spirtes (2002, §8).

□
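
The two regression formulas in the proof of Lemma 2.1 give an explicit rational inverse. The following sketch, in exact rational arithmetic, builds $\Sigma$ from a hypothetical pair $(\Lambda, \Omega)$ and recovers the pair again; the function names, the 0-based node labels and the example values are assumptions made for illustration.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square system a x = b exactly by Gauss-Jordan elimination."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def covariance(lam, omega):
    """Sigma = (I - lam^T)^{-1} diag(omega) (I - lam)^{-1}, obtained from
    X_v = sum_w lam[w][v] X_w + eps_v with nodes in topological order."""
    m = len(lam)
    coef = []  # coef[v][u] is the coefficient of eps_u in X_v
    for v in range(m):
        row = [Fraction(int(u == v)) for u in range(m)]
        for w in range(v):
            row = [r + lam[w][v] * c for r, c in zip(row, coef[w])]
        coef.append(row)
    return [[sum(coef[v][u] * omega[u] * coef[w][u] for u in range(m))
             for w in range(m)] for v in range(m)]


def recover(sigma, parents):
    """Invert the map of Lemma 2.1 using the formulas from its proof."""
    m = len(sigma)
    lam = [[Fraction(0)] * m for _ in range(m)]
    omega = []
    for v in range(m):
        pa = parents[v]
        beta = solve([[sigma[a][b] for b in pa] for a in pa],
                     [sigma[a][v] for a in pa]) if pa else []
        for a, c in zip(pa, beta):
            lam[a][v] = c
        omega.append(sigma[v][v] - sum(sigma[v][a] * c for a, c in zip(pa, beta)))
    return lam, omega


# Hypothetical example with edges 0 -> 1, 0 -> 2, 1 -> 2.
LAM = [[0, 2, -1], [0, 0, 3], [0, 0, 0]]
OMEGA = [1, 2, 3]
PARENTS = [[], [0], [0, 1]]
LAM_HAT, OMEGA_HAT = recover(covariance(LAM, OMEGA), PARENTS)
```

Only field operations on the entries of $\Sigma$ are used, which reflects the statement that the inverse is rational.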


Let the random vector $X$ and the latent variable $L$ have their joint
distribution specified via the equation system from (1.1). Write $\Sigma_{|L}$ for the conditional
covariance matrix of $X$ given $L$. Then it holds that

(2.1)  $\Sigma_{|L} = (I_m - \Lambda^T)^{-1} \Omega (I_m - \Lambda)^{-1}.$

Hence, by Lemma 2.1, when knowing $\Sigma_{|L}$ we can uniquely solve for the pair $(\Lambda, \Omega)$,
which are rational functions of $\Sigma_{|L}$. Writing $\Sigma$ for the (unconditional) covariance
matrix of $X$, we have from (1.2) that

$\Sigma_{|L} = \Sigma - (I_m - \Lambda^T)^{-1} \delta\delta^T (I_m - \Lambda)^{-1}.$

Consequently, $(\Lambda, \Omega)$ can be recovered uniquely from $\Sigma$ and $(I_m - \Lambda^T)^{-1}\delta$. The
results of Stanghellini and Wermuth (2005) then address identification of the vector
$(I_m - \Lambda^T)^{-1}\delta$, which holds the covariances between each coordinate of $X$ and the
latent variable $L$. We obtain the following observation.

Proposition 2.2 (Adapted from Stanghellini and Wermuth, 2005). Let $G = (V, E)$
be a DAG. The model $\mathcal{N}_*(G)$ is generically finitely identifiable if

(i) every connected component of $G^c_{|L,cov} = (V, E^c_{|L,cov})$ has an odd cycle, or

(ii) every connected component of $G^c_{con} = (V, E^c_{con})$ has an odd cycle.

Proof. Theorem 1 in Stanghellini and Wermuth (2005) gives (i) or (ii) as a
sufficient condition for identifying, up to sign, the $m$-vector $(I_m - \Lambda^T)^{-1}\delta$ when
$\Sigma = \phi_G(\Lambda, \Omega, \delta)$ for a generic point $(\Lambda, \Omega, \delta)$ in $\Theta$. In this case, we can uniquely
recover the conditional covariance matrix $\Sigma_{|L}$ from the relation
$\Sigma_{|L} = \Sigma - (I_m - \Lambda^T)^{-1} \delta\delta^T (I_m - \Lambda)^{-1}$ and also the pair $(\Lambda, \Omega)$
by Lemma 2.1. After identifying $\Lambda$, $\delta$ can be solved for, up to sign, by the previous
knowledge of $(I_m - \Lambda^T)^{-1}\delta$. Hence, (i) or (ii) is in fact a sufficient condition for
generic finite identifiability of $\mathcal{N}_*(G)$. □

Figure 2.1. A graph $G$ (left) satisfying the sufficient condition in
Proposition 2.2, with $G_{con}$ (middle) and $G^c_{con}$ (right).

Figure 2.1 shows a DAG with $m = 5$ nodes that satisfies the condition of
Proposition 2.2 (ii).

We conclude this review of prior work by pointing out that any model $\mathcal{N}_*(G)$
that can be determined to be generically finitely identifiable using Proposition 2.2
can also be found to have this property using our new Theorem 1.3.

Proposition 2.3. A DAG $G = (V, E)$ satisfying either one of the conditions in
Proposition 2.2 necessarily satisfies the condition in Theorem 1.3.

Proof. Let $G_{|L,cov} = (V, E_{|L,cov})$ and $G_{con} = (V, E_{con})$. An edge $v \to w \in E$
also presents itself as an undirected edge in both $E_{|L,cov}$ and $E_{con}$. Hence, when
ignoring the directionality of its edges, $G$ is a subgraph of both $G_{|L,cov}$ and $G_{con}$
and, thus, $G^c$ is a supergraph of both $G^c_{|L,cov}$ and $G^c_{con}$. As such, if every connected
component of $G^c_{|L,cov}$, or of $G^c_{con}$, contains an odd cycle, the same is true of $G^c$. □

3. Criteria based on the Jacobian of parametrization maps 

In this section, we prove Theorems 1.3 and 1.4. Let $G = (V, E)$ be a fixed DAG
with $m = |V|$ nodes, and let $\Theta := \mathbb{R}^E \times \mathrm{diag}^+_m \times \mathbb{R}^m$ denote again the domain of
the parametrization

$\phi_G : (\Lambda, \Omega, \delta) \mapsto (I_m - \Lambda^T)^{-1} (\Omega + \delta\delta^T) (I_m - \Lambda)^{-1}$

of the covariance matrix of the distributions in model $\mathcal{N}_*(G)$. We begin by
introducing other mappings that are generically finite-to-one if and only if $\phi_G$ is generically
finite-to-one.

First, it will be helpful to study the map

(3.1)  $\tilde\phi_G : (\Lambda, \Omega, \delta) \mapsto (I_m - \Lambda^T)^{-1} \Omega (I_m - \Lambda)^{-1} + \delta\delta^T,$

defined on $\Theta$. Second, focusing on concentration instead of covariance matrices, we
will also consider the maps

(3.2)  $\psi_G : (\Lambda, \Psi, \gamma) \mapsto (I_m - \Lambda)(\Psi - \gamma\gamma^T)(I_m - \Lambda^T),$

(3.3)  $\tilde\psi_G : (\Lambda, \Psi, \gamma) \mapsto (I_m - \Lambda) \Psi (I_m - \Lambda^T) - \gamma\gamma^T.$

Lemma 3.1. The parametrization $\phi_G$ is generically finite-to-one if and only if any
one of the maps $\tilde\phi_G$, $\psi_G$ and $\tilde\psi_G$ is generically finite-to-one.






Proof. Consider first the map $\tilde\phi_G$, for which it holds that $\phi_G = \tilde\phi_G \circ g$, where

$g : (\Lambda, \Omega, \delta) \mapsto (\Lambda, \Omega, (I_m - \Lambda^T)^{-1}\delta)$

is a diffeomorphism that maps $\Theta$ to itself. By the chain rule, the Jacobian of $\phi_G$ at
$(\Lambda, \Omega, \delta)$ is the product of the Jacobian of $\tilde\phi_G$ at $g(\Lambda, \Omega, \delta)$ and the Jacobian of $g$ at
$(\Lambda, \Omega, \delta)$. Now the latter matrix is invertible on all of $\Theta$ since $g$ is a diffeomorphism.
It follows that there exists a point in $\Theta$ at which the Jacobian of $\tilde\phi_G$ has full column
rank if and only if the same is true for $\phi_G$. For the Jacobian of a polynomial map
such as $\phi_G$ and $\tilde\phi_G$, full column rank at a single point implies generically full column
rank; use the subdeterminants that characterize a drop in rank to define a proper
algebraic subset of exceptions, see also Geiger et al. (2001, Lemma 9). The claim
about $\phi_G$ and $\tilde\phi_G$ follows from Lemma 1.1.

Let $h : (\Lambda, \Psi, \gamma) \mapsto (\Lambda, \Psi, (I_m - \Lambda)\gamma)$. Since $\psi_G = \tilde\psi_G \circ h$, by the same argument
as above it also holds that $\psi_G$ is generically finite-to-one if and only if $\tilde\psi_G$ has this
property.

In order to complete the proof of the lemma it suffices to show that $\phi_G$ is
generically finite-to-one if and only if the same holds for $\psi_G$. Define another
diffeomorphism from $\Theta$ to itself as

$\rho : (\Lambda, \Omega, \delta) \mapsto (\Lambda, \Omega^{-1}, (1 + \delta^T \Omega^{-1} \delta)^{-1/2} \Omega^{-1} \delta).$

Writing inv for matrix inversion, we then have that

(3.4)  $\mathrm{inv} \circ \phi_G = \psi_G \circ \rho$

because of the identity $(\Omega + \delta\delta^T)^{-1} = \Psi - \gamma\gamma^T$ with $\Psi = \Omega^{-1}$ and $\gamma = \kappa^{-1/2} \Psi \delta$,
where $\kappa = 1 + \delta^T \Psi \delta > 0$; see e.g. Rao (1973, p. 33). Using (3.4), the equivalence
of being generically finite-to-one for $\phi_G$ and $\psi_G$ may be argued similarly as for the
maps considered earlier. □

Let $J(\tilde\psi_G)$ be the Jacobian matrix of the map $\tilde\psi_G$ from (3.3). It will be examined
to prove Theorem 1.3. In light of Lemmas 1.1 and 3.1, we will show that if $G$
satisfies the condition in Theorem 1.3, then $J(\tilde\psi_G)$ is generically of full column
rank, implying that $\phi_G$ is generically finite-to-one. Our arguments will make use
of the following lemma that rests on observations made in Vicard (2000).
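
The rank-one inversion identity behind (3.4) can be confirmed in exact arithmetic. The sketch below is an illustration only; the function name and the example values are hypothetical. It uses $\gamma\gamma^T = \Psi\delta\delta^T\Psi / \kappa$, so the square root in $\gamma$ never has to be evaluated.

```python
from fractions import Fraction


def rank_one_inverse_holds(omega, delta):
    """Check (Omega + delta delta^T)(Psi - gamma gamma^T) = I exactly, where
    Psi = Omega^{-1}, kappa = 1 + delta^T Psi delta and
    gamma = kappa^{-1/2} Psi delta."""
    m = len(omega)
    psi = [1 / Fraction(w) for w in omega]
    kappa = 1 + sum(p * d * d for p, d in zip(psi, delta))
    left = [[(omega[i] if i == j else 0) + delta[i] * delta[j]
             for j in range(m)] for i in range(m)]
    right = [[(psi[i] if i == j else 0)
              - psi[i] * delta[i] * psi[j] * delta[j] / kappa
              for j in range(m)] for i in range(m)]
    product = [[sum(left[i][k] * right[k][j] for k in range(m))
                for j in range(m)] for i in range(m)]
    return product == [[int(i == j) for j in range(m)] for i in range(m)]


# Hypothetical positive variances and latent loadings.
IDENTITY_OK = rank_one_inverse_holds([1, 2, 3], [1, -1, 2])
```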


Lemma 3.2. Let $G = (V, E)$ be an undirected graph, and let $f_G : \mathbb{R}^m \to \mathbb{R}^{|E|}$ be the
map with coordinate functions

$f_{G,vw}(x) = x_v x_w, \qquad v - w \in E.$

Then the Jacobian of $f_G$ has generic rank $m - d$, where $m = |V|$ is the number of
nodes and $d$ is the number of connected components of $G$ that do not contain an
odd cycle.

Proof. For simpler notation, let $f := f_G$. Let $J_f$ be the Jacobian matrix of the
polynomial map $f$, and let $\ker(J_f)$ be its kernel. By the rank theorem (Rudin,
1976, p. 229), the dimension of $\ker(J_f)$ is generically equal to the dimension of the
fiber $\mathcal{F}_f$; recall (1.4). Since $\mathrm{rank}(J_f) = m - \dim(\ker(J_f))$, it suffices to show that
$\mathcal{F}_f$ has generic dimension $d$.

Since the claim is about a generic property, we may restrict the domain of $f$ to
the open set $\mathcal{X} := (\mathbb{R} \setminus \{0\})^m$. This assumption is made so that Lemma 1 in Vicard
(2000) is applicable later without difficulty. Now, fix a point $y \in f(\mathcal{X}) \subset \mathbb{R}^{|E|}$. The
elements of the fiber $\mathcal{F}_f(y)$ are the vectors $x \in \mathbb{R}^m$, or equivalently, $x \in \mathcal{X}$, that
are solutions to the system of equations

(3.5)  $y_{vw} = x_v x_w, \qquad v - w \in E.$

Let $G_1 = (V_1, E_1), \dots, G_k = (V_k, E_k)$ be the connected components of $G$, so
that $V_1, \dots, V_k$ form a partition of $V$ and $E_1, \dots, E_k$ partition $E$. Let $k' \leq k$ be
the number of connected components containing at least two nodes. Without loss
of generality, assume $G_{k'+1}, \dots, G_k$ are all the connected components with only a
single node. Then the equations listed in (3.5) can be arranged to form $k'$ disjoint
subsystems indexed by $i = 1, \dots, k'$. The $i$-th subsystem has the form

(3.6)  $y_{vw} = x_v x_w, \qquad v - w \in E_i,$

and exclusively involves the variables $\{x_v : v \in V_i\}$. By Lemma 1 in Vicard (2000)
and also the relevant discussion in the proof of Theorem 1 in the same paper, the
solution set to (3.6) either contains two points or can be parametrized by a single
free variable in $\mathbb{R}$. The former case arises if and only if $G_i$ contains an odd cycle.
It follows that the dimension of the solution set of (3.6) is zero when $G_i$ contains
an odd cycle, and it has dimension one if $G_i$ does not contain an odd cycle. In
addition, each singleton component $G_i = (V_i, \emptyset)$ for $i = k' + 1, \dots, k$ provides one
additional dimension to the fiber $\mathcal{F}_f(y)$, since the corresponding variables in $x$ are
not restricted by any equations. We conclude that the dimension of $\mathcal{F}_f(y)$ equals
the number of connected components $G_i$ that do not contain an odd cycle. □
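
The rank statement of Lemma 3.2 can be verified on small graphs. The sketch below evaluates the Jacobian of $x \mapsto (x_v x_w)_{v - w \in E}$ at a point with non-zero coordinates, computes its rank exactly, and compares it with $m - d$. The function names, the evaluation point $x_v = v + 1$ and the example graphs are assumptions chosen for illustration.

```python
from collections import deque
from fractions import Fraction


def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def jacobian(m, edges, x):
    """The row for edge v - w has x_w in column v and x_v in column w."""
    rows = []
    for v, w in edges:
        row = [0] * m
        row[v - 1], row[w - 1] = x[w - 1], x[v - 1]
        rows.append(row)
    return rows


def bipartite_components(m, edges):
    """Number d of connected components without an odd cycle (nodes 1..m)."""
    adj = {v: set() for v in range(1, m + 1)}
    for v, w in edges:
        adj[v].add(w)
        adj[w].add(v)
    color, count = {}, 0
    for start in adj:
        if start in color:
            continue
        color[start], queue, bipartite = 0, deque([start]), True
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in color:
                    color[w] = 1 - color[v]
                    queue.append(w)
                elif color[w] == color[v]:
                    bipartite = False
        count += bipartite
    return count


def rank_matches_lemma(m, edges):
    x = [v + 1 for v in range(1, m + 1)]  # any point with non-zero entries
    return rank(jacobian(m, edges, x)) == m - bipartite_components(m, edges)
```

A triangle gives rank 3 with d = 0, while a 4-cycle gives rank 3 with d = 1, since its Jacobian annihilates the vector $(x_1, -x_2, x_3, -x_4)$.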


We return to the object of study, namely, the map $\tilde\psi_G$, which sends the $(2m + |E|)$-
dimensional set $\Theta = \mathbb{R}^E \times \mathrm{diag}^+_m \times \mathbb{R}^m$ to the $\binom{m+1}{2}$-dimensional space of symmetric
$m \times m$ matrices. The Jacobian $J(\tilde\psi_G)$ is of size $\binom{m+1}{2} \times (2m + |E|)$, and we index
its rows by pairs $(v, w)$ with $1 \leq v \leq w \leq m$, where, as in Section 1, we assume
the vertex set $V = \{1, \dots, m\}$ to be topologically ordered. We now describe a
particular way of arranging the rows and columns of $J(\tilde\psi_G)$.

Define the set of "non-edges" as $N := \{(v, w) : v < w \text{ and } (v, w) \notin E\}$; we will
also write $v \not\to w$ to express that $(v, w) \in N$. Also, define $D := \{(v, v) : v \in V\}$, so
that $D \cup E \cup N$ index all entries in the upper triangular half of an $m \times m$ symmetric
matrix. The rows of $J(\tilde\psi_G)$ are now arranged in the order $D$, $E$ and $N$. The
columns of $J(\tilde\psi_G)$ are indexed such that partial derivatives with respect to the free
input variables in the triple $(\Lambda, \Psi, \gamma)$ appear from left to right, in the order $\Psi$, $\Lambda$
and $\gamma$. In other words, we partition $J(\tilde\psi_G)$ into 9 blocks as follows:

(3.7)  $J(\tilde\psi_G) = \begin{array}{c|ccc} & \Psi & \Lambda & \gamma \\ \hline D & \cdot & \cdot & \cdot \\ E & \cdot & \cdot & \cdot \\ N & \cdot & \cdot & \cdot \end{array}$

The following lemma is obtained by inspection of the partial derivatives of $\tilde\psi_G$.
Its proof appears in Appendix A.

Lemma 3.3. The Jacobian matrix $J(\tilde\psi_G)$ is generically of full column rank provided
that the submatrix $[J(\tilde\psi_G)]_{N,\gamma}$ is so.

We now give the proof of Theorem 1.3.

Proof of Theorem 1.3. By Lemmas 1.1 and 3.3, it suffices to show that $[J(\tilde\psi_G)]_{N,\gamma}$
is generically of full column rank. For each $v \not\to w \in N$,

(3.8)  $[\tilde\psi_G(\Lambda, \Psi, \gamma)]_{vw} = [(I_m - \Lambda) \Psi (I_m - \Lambda^T)]_{vw} - \gamma_v \gamma_w.$

Note that only the rightmost term in (3.8) contributes to the partial derivatives of
$\tilde\psi_G$ with respect to $\gamma = (\gamma_v)_{v \in \{1, \dots, m\}}$.

Ignoring the directionality of non-edges in $N$, define the undirected graph $H =
(V, N)$ to which we associate a map $f_H$ as in Lemma 3.2. Then

$[J(\tilde\psi_G)]_{N,\gamma} = -J_{f_H}.$

But $J_{f_H}$ has generically full column rank by Lemma 3.2 because, in fact, $H$ is equal
to the complementary graph $G^c$ for which we assume that all connected components
contain an odd cycle. □

We remark that Theorem 1.3 can also be proven by studying the Jacobian of the
map $\tilde\phi_G$ from (3.1). We chose to work with $\tilde\psi_G$ above since this allowed us to avoid
consideration of the inverse of the matrix $I_m - \Lambda$. For Theorem 1.4, however, we
consider both $\tilde\psi_G$ and $\tilde\phi_G$.

Proof of Theorem 1.4. We first prove the necessity of condition (i) by showing that
if $|E_{con}| - |E| < d_{con}$, then the Jacobian matrix $J(\tilde\psi_G)$ always has row rank less
than $2m + |E|$. This implies that it cannot be of full column rank, which implies
the failure of generic finite identifiability by Lemma 1.1.

As in the proof of Theorem 1.3, we consider the set of non-edges $N$, which we
now partition as $N = N_1 \cup N_2$, where $N_1 = \{v \not\to w \in N : v - w \in E_{con}\}$, and
$N_2 = N \setminus N_1$. Accordingly, we can partition the submatrix $[J(\tilde\psi_G)]_{N,\{\Psi,\Lambda,\gamma\}}$ into
two blocks of rows indexed by $N_1$ and $N_2$ as

(3.9)  $[J(\tilde\psi_G)]_{N,\{\Psi,\Lambda,\gamma\}} = \begin{array}{c|ccc} & \Psi & \Lambda & \gamma \\ \hline N_1 & \cdot & \cdot & \cdot \\ N_2 & 0 & 0 & \cdot \end{array}$

To see that the submatrix $[J(\tilde\psi_G)]_{N_2,\{\Psi,\Lambda\}} = 0$, observe first that an entry of
$(I - \Lambda) \Psi (I - \Lambda^T)$ is the zero polynomial if and only if the same is true for $\Sigma_{|L}^{-1}$,
where $\Sigma_{|L}$ is the matrix from (2.1). Second, by definition of $E_{con}$ and $N_2$, if
$(v, w) \in N_2$ then $(\Sigma_{|L}^{-1})_{vw} = 0$.

Next, observe that to prove the necessity of condition (i) it suffices to show that
the rank of $[J(\tilde\psi_G)]_{N_2,\gamma}$ cannot be larger than $m - d_{con}$. Indeed, if this is true, then
there exists a subset $N_2' \subset N_2$ with $|N_2'| = m - d_{con}$, such that the submatrix

$[J(\tilde\psi_G)]_{\{D,E,N_1,N_2'\},\{\Psi,\Lambda,\gamma\}}$

has the same rank as the original Jacobian matrix $J(\tilde\psi_G)$. However, the submatrix
$[J(\tilde\psi_G)]_{\{D,E,N_1,N_2'\},\{\Psi,\Lambda,\gamma\}}$ has $2m + |E_{con}| - d_{con}$ rows, and thus its rank is less
than $2m + |E|$ because, by assumption, we have $|E_{con}| - |E| < d_{con}$. As a result,
$J(\tilde\psi_G)$ cannot be of full column rank.

It now remains to show that $[J(\tilde\psi_G)]_{N_2,\gamma}$ has rank at most $m - d_{con}$. Observe
that the undirected graph $(V, N_2)$ is equal to the complementary graph $(G_{con})^c$.
Moreover, $[J(\tilde\psi_G)]_{N_2,\gamma}$ is equal to the negative Jacobian of the map $f_{(G_{con})^c}$ that
we get by applying the construction from Lemma 3.2 to $(G_{con})^c$; recall the proof
of Theorem 1.3. Applying Lemma 3.2, we find that $[J(\tilde\psi_G)]_{N_2,\gamma}$ has generic rank
$m - d_{con}$, which is also the maximal rank that $[J(\tilde\psi_G)]_{N_2,\gamma}$ may have.

The proof of (ii) follows the exact same argument as that of (i), by replacing
(a) $G_{con}$ with $G_{|L,cov}$, (b) $d_{con}$ with $d_{cov}$, (c) $\tilde\psi_G$ with $\tilde\phi_G$, (d) $\Psi$ with $\Omega$, (e) $\gamma$ with
$\delta$ and (f) $J(\tilde\psi_G)$ with $J(\tilde\phi_G)$, where $J(\tilde\phi_G)$ is partitioned as

(3.10)  $J(\tilde\phi_G) = \begin{array}{c|ccc} & \Omega & \Lambda & \delta \\ \hline D & \cdot & \cdot & \cdot \\ E & \cdot & \cdot & \cdot \\ N & \cdot & \cdot & \cdot \end{array}$

similarly to (3.7). □


4. Algebraic computations and examples 

As explained in Drton (2006, §3) and Garcia-Puente et al. (2010), identifiability
properties of a model such as $\mathcal{N}_*(G)$ can be decided using Gröbner basis techniques
from computational algebraic geometry (Cox et al., 2007). While these techniques
are tractable only for small to moderate size problems, we were able to perform
an exhaustive algebraic study of all DAGs $G = (V, E)$ with $m \leq 6$ nodes. Beyond
a mere decision on whether the parametrization map $\phi_G$ is generically finite-to-one, the
algebraic methods also provide information about the generic cardinality of the
fibers of $\phi_G$ as a map defined on complex space.


Definition 4.1. For a DAG $G = (V, E)$, let $\phi_G^{\mathbb{C}}$ be the map obtained by extending
$\phi_G$ to the complex domain $\mathbb{C}^{2m + |E|}$. If the (complex) fibers of $\phi_G^{\mathbb{C}}$ are generically
of cardinality $k$, then we say that $\phi_G^{\mathbb{C}}$ is generically $k$-to-one.


The language of Definition 4.1 allows us to give a refined classification of DAGs
$G$ in terms of the identifiability properties of the parametrization of model $\mathcal{N}_*(G)$.
Indeed, $\mathcal{N}_*(G)$ is generically finitely identifiable if and only if $\phi_G^{\mathbb{C}}$ is generically
$k$-to-one for some $k < \infty$.


Remark. The generic size of the fibers of $\phi_G^{\mathbb{C}}$ equals the generic size of the fibers of
the complex extensions of the three maps from Lemma 3.1. The map $\tilde\psi_G$ has low
degree coordinates and tends to be the easiest to work with in algebraic
computation. Another approach that can be useful is to adapt the algorithm described in
Section 8 of the supplementary material for Foygel et al. (2012). To do this, note
that for $\Lambda \in \mathbb{C}^E$ there exist complex choices of $\Omega$ and $\delta$ such that $\phi_G(\Lambda, \Omega, \delta) = \Sigma$
if and only if $(I - \Lambda^T) \Sigma (I - \Lambda)$ is a matrix that is the sum of a diagonal matrix,
namely, $\Omega$, and a symmetric matrix of rank 1, namely, $\delta\delta^T$. Whether a matrix is of
the latter type can be tested using tetrads, that is, $2 \times 2$ subdeterminants involving
only off-diagonal entries of the matrix; see also (5.4) below. The tetrads of a matrix
form a Gröbner basis (de Loera et al., 1995, Drton et al., 2007).
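
To make the notion of a tetrad concrete, the sketch below enumerates all $2 \times 2$ subdeterminants with four distinct indices and evaluates them on a matrix of the form $\Omega + \delta\delta^T$. It only illustrates that these tetrads vanish on such matrices; it is not the algorithm of Foygel et al. (2012), and the function names and example values are hypothetical.

```python
from itertools import permutations


def tetrads(s):
    """All 2 x 2 subdeterminants s[i][j] s[k][l] - s[i][l] s[k][j] with
    i, j, k, l pairwise distinct, so only off-diagonal entries are used."""
    m = len(s)
    return [s[i][j] * s[k][l] - s[i][l] * s[k][j]
            for i, j, k, l in permutations(range(m), 4)]


def tetrads_vanish(s):
    return all(t == 0 for t in tetrads(s))


def diagonal_plus_rank_one(omega, delta):
    m = len(omega)
    return [[(omega[i] if i == j else 0) + delta[i] * delta[j]
             for j in range(m)] for i in range(m)]


# Hypothetical example: every tetrad cancels because each off-diagonal
# entry equals delta_i * delta_j.
S = diagonal_plus_rank_one([1, 2, 3, 4], [1, 2, -1, 3])

# Perturbing one symmetric pair of off-diagonal entries breaks the structure.
S_PERTURBED = [row[:] for row in S]
S_PERTURBED[0][1] += 1
S_PERTURBED[1][0] += 1
```

Here `tetrads_vanish(S)` holds, while `tetrads_vanish(S_PERTURBED)` does not.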


Table 1 lists the counts of DAGs $G = (V, E)$, with $4 \le m \le 6$ nodes, that
have $\phi_G^{\mathbb{C}}$ generically $k$-to-one, for all possible values of $k$. The table also gives
the counts of DAGs satisfying the conditions in Theorems 1.3 and 1.4 as well as
Proposition 2.2. DAGs with $\binom{m+1}{2} - 2m < |E|$, which trivially give generically
$\infty$-to-one maps $\phi_G^{\mathbb{C}}$ in view of Corollary 1.2, are excluded. We emphasize that the
counts are with respect to unlabeled DAGs, that is, all DAGs that are isomorphic
with respect to relabeling of nodes are counted as one unlabeled graph.

In the considered settings the condition in Theorem 1.3 is very successful in
certifying DAGs with a generically finitely identifiable model. For instance, when

Table 1. Counts of unlabeled DAGs $G$ with $m$ nodes, at most
$\binom{m+1}{2} - 2m$ edges, and complex parametrization $\phi_G^{\mathbb{C}}$ generically $k$-to-one.
Counts are also given for DAGs that satisfy the sufficient
conditions from Thm. 1.3 and Prop. 2.2 and DAGs that fail to
satisfy the necessary condition from Thm. 1.4.

m                      4      5      6
k < ∞                  5     95   3344
k = 2                  5     87   2961
k = 4                  0      8    345
k = 6                  0      0     24
k = 8                  0      0     14
Prop. 2.2              5     49    985
Thm. 1.3               5     88   2957
k = ∞                  1     20    552
Thm. 1.4               1     20    361
Total # of DAGs        6    115   3896


$m = 6$, it is able to correctly identify 2957 out of 3344 such graphs. The previously
known sufficient condition of Stanghellini and Wermuth (2005) identifies 985 of
them. Our necessary condition in Theorem 1.4 is also useful in assessing graphs
that give generically infinite-to-one models. For instance, when $m = 6$, we find that
361 of 552 such graphs violate the condition; recall the example from Figure 1.2.
While, by Proposition 2.3, our sufficient condition in Theorem 1.3 is stronger
than that in Proposition 2.2 for generic finite identifiability, the latter condition,
due to Stanghellini and Wermuth (2005), in fact implies that $\phi_G^{\mathbb{C}}$ is generically 2-to-one.
For $m = 5$, there are 6 DAGs that satisfy the condition in Theorem 1.3 but
give generically 4-to-one maps $\phi_G^{\mathbb{C}}$. The graph from Figure 1.1 is an example. We
note that for this graph $G$ the fibers of $\phi_G^{\mathbb{C}}$ intersect the statistically relevant set $\Theta$
in either 2 or 4 points, and both possibilities do occur.
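
The counts in Table 1 satisfy some simple arithmetic identities, which the following sketch verifies: the rows for $k = 2, 4, 6, 8$ add up to the row $k < \infty$, and the finite and infinite cases add up to the total. It also evaluates the edge bound $\binom{m+1}{2} - 2m$ used to exclude DAGs. The dictionary layout and the function name are assumptions made for this illustration.

```python
from math import comb

# Counts transcribed from Table 1, keyed by the number of nodes m.
table = {
    4: {"k": {2: 5, 4: 0, 6: 0, 8: 0}, "finite": 5, "infinite": 1, "total": 6},
    5: {"k": {2: 87, 4: 8, 6: 0, 8: 0}, "finite": 95, "infinite": 20, "total": 115},
    6: {"k": {2: 2961, 4: 345, 6: 24, 8: 14}, "finite": 3344, "infinite": 552,
        "total": 3896},
}

def max_edges(m):
    """Largest |E| not excluded by the dimension bound binom(m+1, 2) - 2m."""
    return comb(m + 1, 2) - 2 * m

# Row sums: k-to-one counts give the finite count; finite + infinite gives the total.
consistent = all(
    sum(row["k"].values()) == row["finite"]
    and row["finite"] + row["infinite"] == row["total"]
    for row in table.values()
)
edge_bounds = {m: max_edges(m) for m in table}
```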


5. Subgraph extension 

This section concerns results on how we can extend knowledge about identifiability
of an induced subgraph to that of the original DAG. We recall standard
terminology in graphical modeling. For a given DAG $G = (V, E)$, we write $\mathrm{pa}(v) =
\{w : w \to v \in E\}$ for the parent set of the node $v$, and $\mathrm{ch}(v) = \{w : v \to w \in E\}$ for
the child set of $v$. If for some node $s \in V$ there does not exist a node $s' \in V$ with
$s \to s' \in E$, then $s$ is a sink node. If there is no other node $s' \in V$ with $s' \to s \in E$,
then $s$ is a source node. The following theorem is the main result of this section.

Theorem 5.1. Given a DAG $G = (V, E)$, if there exists

(i) a sink node $s \in V$ such that $\mathrm{pa}(s) \neq V \setminus \{s\}$ and the model $\mathcal{N}_*(G')$ of the
induced subgraph $G'$ on $V \setminus \{s\}$ is generically finitely identifiable, or

(ii) a source node $s \in V$ such that $\mathrm{ch}(s) \neq V \setminus \{s\}$ and the model $\mathcal{N}_*(G')$ of
the induced subgraph $G'$ on $V \setminus \{s\}$ is generically finitely identifiable,

then the model $\mathcal{N}_*(G)$ is generically finitely identifiable.
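
The purely graphical part of Theorem 5.1 can be sketched as a search for nodes at which the reduction may be attempted. The code below lists sink nodes whose parent set is not all of $V \setminus \{s\}$ and source nodes whose child set is not all of $V \setminus \{s\}$, and forms the induced subgraph $G'$. Whether $\mathcal{N}_*(G')$ is generically finitely identifiable has to be established separately; the function names and the example graph are assumptions.

```python
def reduction_candidates(nodes, edges):
    """Return pairs (s, kind) with kind in {"sink", "source"}.

    A sink s qualifies if pa(s) differs from V minus {s}; a source s qualifies
    if ch(s) differs from V minus {s}. An isolated node is reported twice.
    """
    nodes = set(nodes)
    parents = {v: set() for v in nodes}
    children = {v: set() for v in nodes}
    for v, w in edges:  # each edge is a pair (v, w) meaning v -> w
        children[v].add(w)
        parents[w].add(v)
    out = []
    for s in sorted(nodes):
        others = nodes - {s}
        if not children[s] and parents[s] != others:
            out.append((s, "sink"))
        if not parents[s] and children[s] != others:
            out.append((s, "source"))
    return out

def induced_subgraph(nodes, edges, removed):
    """Induced subgraph G' on V minus {removed}."""
    keep = [v for v in nodes if v != removed]
    return keep, [(v, w) for v, w in edges if removed not in (v, w)]

# Hypothetical example with edges 1 -> 2, 2 -> 3, 1 -> 3, 3 -> 4.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
candidates = reduction_candidates(nodes, edges)
sub_nodes, sub_edges = induced_subgraph(nodes, edges, 4)
```

In this example node 4 is a sink with $\mathrm{pa}(4) = \{3\}$, so part (i) reduces the question to the induced subgraph on $\{1, 2, 3\}$.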
Recall that in Table 1 there are $3344 - 2957 = 387$ DAGs with $m = 6$ nodes
that are generically finitely identifiable but do not satisfy our sufficient condition
from Theorem 1.3. Theorem 5.1 above provides a way to certify identifiability
of models falling within this “gap”, provided that we have knowledge of which
DAGs on $m = 5$ nodes are generically finitely identifiable. For instance, from
our algebraic computations we know that there are $95 - 88 = 7$ DAGs on $m = 5$ nodes that are
generically finitely identifiable but cannot be proven to be so by Theorem 1.3. Of
the 387 aforementioned DAGs on 6 nodes, 194 can be proven to be generically
finitely identifiable by using the knowledge about the 7 graphs on $m = 5$ nodes
and applying Theorem 5.1. We remark that if a DAG satisfies the condition in
Theorem 1.3, the resulting supergraph obtained by adding a sink (source)
node that does not have every other node as its parent (child) must also satisfy the
condition in Theorem 1.3. Hence, given the current state of the art, Theorem 5.1 is
useful primarily as a tool to reduce the identifiability problem to smaller subgraphs
that may then be tackled by algebraic methods.

Theorem 5.1 is obtained by studying the maps $\phi_G$ and $\tilde\varphi_G$ introduced earlier.
First consider $\phi_G$. In light of Lemma 1.1(ii), we can show that $\phi_G$ is generically
finite-to-one if there exists a proper algebraic subset $\Xi \subset \mathbb{R}^{2m+|E|}$ such that
$|\mathcal{F}_{\phi_G|_{\Theta \setminus \Xi}}(\theta_0)| < \infty$ for all $\theta_0 = (\Lambda_0, \Omega_0, \delta_0) \in \Theta \setminus \Xi$, or equivalently,

(5.1) $(I_m - \Lambda^T)\,\phi_G(\theta_0)\,(I_m - \Lambda) = \Omega + \delta\delta^T$,

has finitely many solutions for $(\Lambda, \Omega, \delta)$ in $\Theta \setminus \Xi$. Throughout this section, $\Xi$ is
taken so that all points $(\Lambda, \Omega, \delta) \in \Theta \setminus \Xi$ have $\delta_i \neq 0$ for all $i = 1, \dots, m$. As such,
the matrix $\Omega + \delta\delta^T$ on the right hand side of (5.1) has all entries nonzero and is
known as a Spearman matrix.

Definition 5.1. A symmetric matrix $\Upsilon \in \mathbb{R}^{m \times m}$ of size $m \ge 3$ is a Spearman
matrix if $\Upsilon = \Omega + \delta\delta^T$ for a diagonal matrix $\Omega$ with positive diagonal and a vector
$\delta$ with no zero elements.


Any Spearman matrix $\Upsilon$ is positive definite, and it is not difficult to show that
if $\Upsilon = \Omega + \delta\delta^T$ is Spearman with $m \ge 3$ then the two summands $\Omega$ and $\delta\delta^T$ are
uniquely determined as rational functions of $\Upsilon$. Moreover, $\delta\delta^T$ determines $\delta$ up to
sign change. For these facts see, for instance, Theorem 5.5 in Anderson and Rubin
(1956). We term $\Omega$ the diagonal component of $\Upsilon$, and $\delta\delta^T$ the rank-1 component.
The following theorem gives an implicit characterization of Spearman matrices of
size $m \ge 4$.
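
One way to see that the two components are rational functions of the matrix is the identity $\delta_i^2 = \upsilon_{ij}\upsilon_{ik}/\upsilon_{jk}$ for distinct $i, j, k$, where $\upsilon_{ij}$ denotes the $(i,j)$ entry. The sketch below uses it to recover both components from a Spearman matrix. This particular formula is our choice of illustration and need not be the route taken in Anderson and Rubin (1956); the function name and test values are assumptions.

```python
from fractions import Fraction

def spearman_components(T):
    """Recover (omega, rank_one) with T = diag(omega) + rank_one.

    Assumes T is a Spearman matrix of size m >= 3, so that all off-diagonal
    entries are nonzero and delta_i^2 = T[i][j] * T[i][k] / T[j][k].
    """
    m = len(T)
    delta_sq = []
    for i in range(m):
        j, k = [a for a in range(m) if a != i][:2]  # any two other indices
        delta_sq.append(T[i][j] * T[i][k] / T[j][k])
    omega = [T[i][i] - delta_sq[i] for i in range(m)]
    rank_one = [[delta_sq[i] if i == j else T[i][j] for j in range(m)]
                for i in range(m)]
    return omega, rank_one

# Hypothetical Spearman matrix built from known components.
omega_true = [Fraction(2), Fraction(1, 3), Fraction(5)]
delta_true = [Fraction(1), Fraction(-2), Fraction(3, 4)]
T = [[delta_true[i] * delta_true[j] + (omega_true[i] if i == j else 0)
      for j in range(3)] for i in range(3)]
omega_rec, rank_one_rec = spearman_components(T)
```

The vector $\delta$ itself is recovered only up to a global sign, since only the products $\delta_i\delta_j$ enter the matrix.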


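Theorem 5.2. A positive definite symmetric matrix $\Upsilon = (\upsilon_{ij}) \in \mathbb{R}^{m \times m}$ of size
$m \ge 4$ is a Spearman matrix if and only if, after sign changes of rows and corresponding
columns, all its elements are positive and such that

(5.2) $\upsilon_{ij}\upsilon_{kl} - \upsilon_{ik}\upsilon_{jl} = \upsilon_{il}\upsilon_{jk} - \upsilon_{ik}\upsilon_{jl} = \upsilon_{ij}\upsilon_{kl} - \upsilon_{il}\upsilon_{jk} = 0$

for $i < j < k < l$, and

(5.3) $\upsilon_{ii}\upsilon_{jk} - \upsilon_{ik}\upsilon_{ji} > 0$

for $i \neq j \neq k$.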

This is essentially the same as Theorem 1 in Bekker and de Leeuw (1987), to which
the reader is referred for a proof. Unlike Bekker and de Leeuw (1987), we have a
strict inequality in (5.3) since in Definition 5.1 we require the diagonal component
of a Spearman matrix to be strictly positive.
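
To see why the two conditions are necessary, one can substitute the entries of a Spearman matrix directly. The short computation below is a worked check, not part of the cited proof; it writes $\upsilon_{ij}$ for the entries of the matrix, $\omega_i$ for the diagonal entries of the diagonal component, and assumes distinct indices.

```latex
% Assume \Upsilon = \Omega + \delta\delta^T with \omega_i > 0 and, after sign
% changes of rows and corresponding columns, \delta_i > 0 for all i.
\upsilon_{ij}\upsilon_{kl} - \upsilon_{ik}\upsilon_{jl}
  = \delta_i\delta_j\delta_k\delta_l - \delta_i\delta_k\delta_j\delta_l = 0,
\qquad
\upsilon_{ii}\upsilon_{jk} - \upsilon_{ik}\upsilon_{ji}
  = (\omega_i + \delta_i^2)\,\delta_j\delta_k - \delta_i^2\,\delta_j\delta_k
  = \omega_i\,\delta_j\delta_k > 0 .
```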

The three polynomial expressions in (5.2) are the $2 \times 2$ off-diagonal minors of the
matrix $\Upsilon$, which are also known as tetrads in the literature. We call the quadruple
$i < j < k < l$ the indices of the tetrad they define. Note that

$\upsilon_{ij}\upsilon_{kl} - \upsilon_{il}\upsilon_{jk} = (\upsilon_{ij}\upsilon_{kl} - \upsilon_{ik}\upsilon_{jl}) - (\upsilon_{il}\upsilon_{jk} - \upsilon_{ik}\upsilon_{jl})$

so that the three tetrads in (5.2) are algebraically dependent. In general, a symmetric
$m \times m$ matrix $\Upsilon$ has $2\binom{m}{4}$ algebraically independent tetrads and we write
TETRADS($\Upsilon$) to denote a column vector comprising a choice of $2\binom{m}{4}$ algebraically
independent tetrads.

For each triple $(\Lambda, \Omega, \delta) \in \Theta \setminus \Xi$ that solves (5.1), it must be true that

(5.4) TETRADS$\big((I_m - \Lambda^T)\,\phi_G(\theta_0)\,(I_m - \Lambda)\big) = 0$.

Together with the uniqueness of the diagonal and rank-1 components for a Spearman
matrix, if we can show only finitely many $\Lambda$'s solve the system (5.4), then we
have shown that the model $\mathcal{N}_*(G)$ is generically finitely identifiable. Our proof for
Theorem 5.1 follows this approach.

Alternatively, based on Lemma 3.1, we can also prove generic finite identifiability
by considering the map $\tilde\varphi_G$. We then need to show that there exists a
proper algebraic subset $\Xi \subset \mathbb{R}^{2m+|E|}$ so that $|\mathcal{F}_{\tilde\varphi_G|_{\Theta \setminus \Xi}}(\theta_0)| < \infty$ for all $\theta_0 =
(\Lambda_0, \Psi_0, \gamma_0) \in \Theta \setminus \Xi$, or equivalently,

(5.5) $(I_m - \Lambda)^{-1}\,\tilde\varphi_G(\theta_0)\,(I_m - \Lambda^T)^{-1} = \Psi - \gamma\gamma^T$

has finitely many solutions for $(\Lambda, \Psi, \gamma)$ in $\Theta \setminus \Xi$. Again we assume that $\Xi$ is defined
to avoid issues due to zeros, that is, every triple $(\Lambda, \Psi, \gamma) \in \Theta \setminus \Xi$ has $\gamma_i \neq 0$ for all
$i = 1, \dots, m$. We introduce the term coSpearman matrix to describe the matrix on
the right hand side of (5.5).


Definition 5.2. A symmetric matrix $\Upsilon \in \mathbb{R}^{m \times m}$ of size $m \ge 3$ is a coSpearman
matrix if $\Upsilon = \Psi - \gamma\gamma^T$ for a diagonal matrix $\Psi$ with positive diagonal and a vector
$\gamma$ with no zero elements.

Again, the diagonal component $\Psi$ and the rank-1 component $\gamma\gamma^T$ are uniquely
determined by $\Upsilon$; compare Stanghellini (1997, p. 243). The following theorem is
analogous to Theorem 5.2.


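Theorem 5.3. A positive definite symmetric matrix $\Upsilon = (\upsilon_{ij}) \in \mathbb{R}^{m \times m}$ of size
$m \ge 4$ is a coSpearman matrix if and only if, after sign changes of rows and
corresponding columns, all its non-diagonal elements are negative and such that

(5.6) $\upsilon_{ij}\upsilon_{kl} - \upsilon_{ik}\upsilon_{jl} = \upsilon_{il}\upsilon_{jk} - \upsilon_{ik}\upsilon_{jl} = \upsilon_{ij}\upsilon_{kl} - \upsilon_{il}\upsilon_{jk} = 0$

for $i < j < k < l$, and

(5.7) $\upsilon_{ii}\upsilon_{jk} - \upsilon_{ik}\upsilon_{ji} < 0$

for $i \neq j \neq k$.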

Using the tetrad characterizations (5.6) and the uniqueness of diagonal and rank-1
components, one can now demonstrate that the restricted map $\tilde\varphi_G|_{\Theta \setminus \Xi}$ has finite
fibers by showing that the system of tetrad equations

(5.8) TETRADS$\big((I_m - \Lambda)^{-1}\,\tilde\varphi_G(\theta_0)\,(I_m - \Lambda^T)^{-1}\big) = 0$

admits only finitely many solutions for $\Lambda$ when $\theta_0 \in \Theta \setminus \Xi$.

The finiteness of solutions in $\Lambda$ for the system (5.4), or (5.8), is a sufficient
condition for the generic finite identifiability of $\mathcal{N}_*(G)$. It is, however, not obvious
that these two systems necessarily have finitely many solutions when $\mathcal{N}_*(G)$ is
generically finitely identifiable. The following lemma states that such a converse
does hold for the following two types of DAGs, whose generic finite identifiability
can be easily checked by Theorem 1.3. Recall that the notation “$\subset$” means “being
a proper subset of”.

Lemma 5.4. Let $G = (V, E)$ be a DAG with vertex set $V = \{1, \dots, m\}$.

(i) If $E \subset \{(k, m) : k \le m - 1\}$, then there exists a proper algebraic subset $\Xi$
such that for all $\theta_0 = (\Lambda_0, \Omega_0, \delta_0) \in \Theta \setminus \Xi$, the system

TETRADS$\big((I_m - \Lambda^T)\,\phi_G(\theta_0)\,(I_m - \Lambda)\big) = 0$

is linear in the variable $\Lambda \in \mathbb{R}^E$ and is solved uniquely by $\Lambda = \Lambda_0$.

(ii) If $E \subset \{(1, k) : k \ge 2\}$, then there exists a proper algebraic subset $\Xi$ such
that for all $\theta_0 = (\Lambda_0, \Psi_0, \gamma_0) \in \Theta \setminus \Xi$, the system

TETRADS$\big((I_m - \Lambda)^{-1}\,\tilde\varphi_G(\theta_0)\,(I_m - \Lambda^T)^{-1}\big) = 0$

is linear in the variable $\Lambda \in \mathbb{R}^E$ and is solved uniquely by $\Lambda = \Lambda_0$.

The proof of Lemma 5.4 is deferred to Appendix A.

Proof of Theorem 5.1. We will first prove (i), which uses Lemma 5.4(i). The proof
of (ii) will follow from similar reasoning using Lemma 5.4(ii).

Without loss of generality, assume that the sink node $s = m$, by giving the
nodes a new topological order if necessary. Define two DAGs as follows. First, let
$G_1 = (V_1, E_1)$ be the subgraph of $G$ induced by the set $V_1 = V \setminus \{m\} = [m - 1]$,
where we adopt the shorthand $[k] := \{1, \dots, k\}$, $k \in \mathbb{N}$. Second, let $G_2 = (V, E \setminus E_1)$
be the graph on $V$ obtained from $G$ by removing all edges that do not have the
sink node $m$ as their head. As before, let $\Theta := \mathbb{R}^E \times \mathrm{diag}_m^+ \times \mathbb{R}^m$. We will construct
a proper algebraic subset $\Xi$, such that for any $\theta \in \Theta \setminus \Xi$, the fiber $\mathcal{F}_{\phi_G|_{\Theta \setminus \Xi}}(\theta)$ is
finite. Then Lemma 1.1(ii) applies and yields the assertion of Theorem 5.1(i).

Let $\Theta_1 := \mathbb{R}^{E_1} \times \mathrm{diag}_{m-1}^+ \times \mathbb{R}^{m-1}$, the open set on which the parametrization
$\phi_{G_1}$ of model $\mathcal{N}_*(G_1)$ is defined. By assumption, there exists a proper algebraic
subset $\Xi_1' \subset \mathbb{R}^{2(m-1)+|E_1|}$ such that the restricted map $\phi_{G_1}|_{\Theta_1 \setminus \Xi_1'}$ has finite fibers,
by Lemma 1.1(ii). Extend $\Xi_1'$ to a proper algebraic subset of $\mathbb{R}^{2m+|E|}$ by defining

$\Xi_1 := \Xi_1' \times \mathbb{R}^{|\mathrm{pa}(m)|} \times \mathbb{R}^2$,

where $\mathbb{R}^{|\mathrm{pa}(m)|}$ accommodates the additional free variables $\lambda_{vm}$ with $v \in \mathrm{pa}(m)$, and
$\mathbb{R}^2$ accommodates the two variables $\Omega_{mm} = \omega_m$ and $\delta_m$.

Next, recall that for a given point $\theta' = (\Lambda', \Omega', \delta') \in \Theta$, any $(\Lambda, \Omega, \delta) \in \mathcal{F}_{\phi_G}(\theta')$
must satisfy the tetrad equations

(5.9) TETRADS$\big((I_m - \Lambda^T)\,\phi_G(\theta')\,(I_m - \Lambda)\big) = 0$.

Let $\Lambda_{E_1} := (\lambda_{vw})_{(v,w) \in E_1}$. Then any tetrad in (5.9) with indices $i < j < k < m$ has
the form

$\sum_{m' \in \mathrm{pa}(m)} a_{m'}\big(\Lambda_{E_1}, \phi_G(\theta')\big)\,\lambda_{m'm} + b\big(\Lambda_{E_1}, \phi_G(\theta')\big)$,

where the $a_{m'}$ as well as $b$ are polynomials with the entries of $\Lambda_{E_1}$ and the entries
of a symmetric $m \times m$ matrix being their variables. Let $\Lambda_{\mathrm{pa}(m),m}$ be the vector
with entries $\lambda_{vm}$ for $v \in \mathrm{pa}(m)$. Then the part of the system (5.9) involving the
variables $\lambda_{vm}$, $v \in \mathrm{pa}(m)$, has the form

(5.10) $C\big(\Lambda_{E_1}, \phi_G(\theta')\big)\,\Lambda_{\mathrm{pa}(m),m} = c\big(\Lambda_{E_1}, \phi_G(\theta')\big)$,

where $C$ is a matrix of size $2\binom{m-1}{3} \times |\mathrm{pa}(m)|$, and $c$ is a vector of length $2\binom{m-1}{3}$.
Both $C$ and $c$ are filled with polynomials in the entries of $\Lambda_{E_1}$ and a symmetric
$m \times m$ matrix. Since $(\Lambda, \Omega, \delta) \in \mathcal{F}_{\phi_G}(\theta')$, we have $\phi_G(\theta') = \phi_G(\Lambda, \Omega, \delta)$ and, thus,

(5.11) $C\big(\Lambda_{E_1}, \phi_G(\Lambda, \Omega, \delta)\big)\,\Lambda_{\mathrm{pa}(m),m} = c\big(\Lambda_{E_1}, \phi_G(\Lambda, \Omega, \delta)\big)$.

As $\theta'$ was an arbitrary point in $\Theta$, (5.11) holds for all $(\Lambda, \Omega, \delta) \in \Theta$. We claim
that $C(\Lambda_{E_1}, \phi_G(\Lambda, \Omega, \delta))$ is of full rank for generic choices of $(\Lambda, \Omega, \delta)$. To see this
note that if $\Lambda_{E_1}$ is set to 0, then (5.11) becomes the system of tetrad equations for
the graph $G_2$. Using Lemma 5.4(i) and the assumption that $\mathrm{pa}(m) \subset V \setminus \{s\}$, we
see that $C(\Lambda_{E_1}, \phi_G(\Lambda, \Omega, \delta))$ achieves full rank for $\Lambda_{E_1} = 0$ and a generic choice of
$(\Lambda_{\mathrm{pa}(m),m}, \Omega, \delta)$. We deduce that the rank is full generically.

Let $\Xi_2$ be a proper algebraic subset such that $C(\Lambda_{E_1}, \phi_G(\Lambda, \Omega, \delta))$ is of full rank
for any $(\Lambda, \Omega, \delta) \in \Theta \setminus \Xi_2$. Let $\Xi_3$ be the (algebraic) set comprising all triples
$(\Lambda, \Omega, \delta)$ with at least one coordinate $\delta_i = 0$, and define $\Xi := \Xi_1 \cup \Xi_2 \cup \Xi_3$. Clearly,
$\Xi$ is a proper algebraic subset of $\mathbb{R}^{2m+|E|}$. Take $(\Lambda_0, \Omega_0, \delta_0)$ to be a point in $\Theta \setminus \Xi$
and define $\Sigma_0 := \phi_G(\Lambda_0, \Omega_0, \delta_0)$. It remains to show that the equation system

(5.12) $\Sigma_0 = \phi_G(\Lambda, \Omega, \delta) = (I_m - \Lambda^T)^{-1}(\Omega + \delta\delta^T)(I_m - \Lambda)^{-1}$

has only finitely many solutions in $(\Lambda, \Omega, \delta)$ over the set $\Theta \setminus \Xi$.

We begin by observing that because $s = m$ is a sink node, by taking submatrices
in (5.12), we obtain the equation system

$(\Sigma_0)_{[m-1]} = \big[(I_m - \Lambda^T)^{-1}(\Omega + \delta\delta^T)(I_m - \Lambda)^{-1}\big]_{[m-1]}$
$\quad = (I_{m-1} - \Lambda_{[m-1]}^T)^{-1}\big(\Omega_{[m-1]} + \delta_{[m-1]}\delta_{[m-1]}^T\big)(I_{m-1} - \Lambda_{[m-1]})^{-1}$
$\quad = \phi_{G_1}\big(\Lambda_{[m-1]}, \Omega_{[m-1]}, \delta_{[m-1]}\big)$.

Here, for an index set $W \subset [m]$, we write $x_W$ to denote the subvector $x_W =
(x_v : v \in W)$ of a vector $x = (x_1, \dots, x_m)^T$, and we similarly write $A_W$ for the
$W \times W$ principal submatrix of a matrix $A$. Let $\mathcal{S} \subset \Theta_1$ be the projection of the
set of all triples $(\Lambda, \Omega, \delta) \in \Theta \setminus \Xi$ that solve (5.12) onto their triple of
submatrices/subvector $(\Lambda_{[m-1]}, \Omega_{[m-1]}, \delta_{[m-1]})$. By choice of $\Xi$, we have that $\mathcal{S} \subset \Theta_1 \setminus \Xi_1'$
and, since $\phi_{G_1}|_{\Theta_1 \setminus \Xi_1'}$ has finite fibers, we know that $\mathcal{S}$ is finite. However, a triple
$(\Lambda_{[m-1]}, \Omega_{[m-1]}, \delta_{[m-1]}) \in \mathcal{S}$ determines the matrix $C$ and the vector $c$ in (5.11)
and, by choice of $\Xi$, we may deduce that $\Lambda_{\mathrm{pa}(m),m}$ is uniquely determined by
$(\Lambda_{[m-1]}, \Omega_{[m-1]}, \delta_{[m-1]})$. It follows that the solutions to (5.12) that are in $\Theta \setminus \Xi$ have
their $\Lambda$ part equal to one of $|\mathcal{S}|/2$ many choices; recall that if $(\Lambda_{[m-1]}, \Omega_{[m-1]}, \delta_{[m-1]})$
is in $\mathcal{S}$ then so is $(\Lambda_{[m-1]}, \Omega_{[m-1]}, -\delta_{[m-1]})$. The proof is now complete because $\Lambda$
determines the Spearman matrix

$(I_m - \Lambda^T)\,\Sigma_0\,(I_m - \Lambda) = \Omega + \delta\delta^T$,

for which the diagonal component $\Omega$ and the rank-1 component $\delta\delta^T$ are uniquely
determined. Given the fact that $\delta\delta^T$ determines $\delta$ only up to sign, (5.12) has
$|\mathcal{S}| < \infty$ solutions over $\Theta \setminus \Xi$, which concludes the proof of (i).
The proof of (ii) is analogous, and we only give a sketch. Instead of considering
$\phi_G$ we turn to $\tilde\varphi_G$, which also has domain $\Theta$. Without loss of generality, we let
the source node be $s = 1$. We then define $G_1 = (V_1, E_1)$ to be the subgraph of $G$
that is induced by $V_1 = \{2, \dots, m\}$, and we let $G_2 = (V, E \setminus E_1)$. We consider the
parametrization $\tilde\varphi_{G_1}$ with domain $\Theta_1 = \mathbb{R}^{E_1} \times \mathrm{diag}_{m-1}^+ \times \mathbb{R}^{m-1}$. By assumption,
$\mathcal{N}_*(G_1)$ is generically finitely identifiable, so there exists a proper algebraic subset
$\Xi_1'$ such that $\tilde\varphi_{G_1}|_{\Theta_1 \setminus \Xi_1'}$ has finite fibers, by Lemma 1.1(ii).

On the other hand, for any $(\Lambda, \Psi, \gamma) \in \Theta$, we have

TETRADS$\big((I_m - \Lambda)^{-1}\,\tilde\varphi_G(\Lambda, \Psi, \gamma)\,(I_m - \Lambda^T)^{-1}\big) = 0$.

Let $\Lambda_{E_1} := (\lambda_{vw})_{(v,w) \in E_1}$ and $\Lambda_{1,\mathrm{ch}(1)} := (\lambda_{1v})_{v \in \mathrm{ch}(1)}$. Then the tetrad equations
with one index equal to $s = 1$ yield the equation system

$C\big(\Lambda_{E_1}, \tilde\varphi_G(\Lambda, \Psi, \gamma)\big)\,\Lambda_{1,\mathrm{ch}(1)} = c\big(\Lambda_{E_1}, \tilde\varphi_G(\Lambda, \Psi, \gamma)\big)$,

where part (ii) of Lemma 5.4 can be applied to show that $C(\Lambda_{E_1}, \tilde\varphi_G(\Lambda, \Psi, \gamma))$ is
of full rank outside some proper algebraic subset $\Xi_2$. We may then define a set $\Xi$
as in the proof of part (i) and use arguments similar to the ones above for a proof
of part (ii) of our theorem. □


6. Discussion 


In this paper we studied identifiability of directed Gaussian graphical models 
with one latent variable that is a common cause of all observed variables. To 
our knowledge, the best criteria to decide on identifiability of such models are
those given by Stanghellini and Wermuth (2005), who consider a more general setup
of Gaussian graphical models with one latent variable. Their results provide a
sufficient condition for the strictest notion of identifiability that is meaningful in this
context, namely, whether the parametrization map is generically 2-to-one. Recall
that the coefficients associated with the edges pointing from the latent variable to 
the observables can only be recovered up to a common sign change. 

In our work, we take a different approach and study the Jacobian matrix of the
parametrization, which leads to graphical criteria to check whether the parametrization
is finite-to-one. Our sufficient condition covers all graphs that can be shown
to have a 2-to-one parametrization by the conditions of Stanghellini and Wermuth
(2005). However, our sufficient condition, which is stated as Theorem 1.3, covers
far more graphs, as was shown in the computational experiments in Section 4. Our
Theorem 1.4 describes a complementary necessary condition.

By studying tetrad equations, we also give a criterion that allows one to deduce 
identifiability of certain graphs from identifiability of subgraphs (Theorem 5.1).
This result is stated for generic finite identifiability but as is clear from the proof, 
the result would also confirm that the parametrization of a graph is generically 2- 
to-one provided the involved subgraph has a generically 2-to-one parametrization. 

The extension result from Theorem 5.1 can be used in conjunction with the
results obtained by the algebraic computations in Section 4. These computations
solve the identifiability problem for graphs with up to 6 nodes. In particular,
we confirm that the sufficient conditions of Stanghellini and Wermuth (2005) are
not necessary for the parametrization map to be generically 2-to-one and provide 
examples of graphs that yield a generically finite but not 2-to-one parametrization. 
As mentioned above, we studied models with one latent source 0 that is connected
to all nodes that represent observed variables. However, the graphical criteria in
Theorems 1.3 and 1.4 can be readily extended to models with some of these factor
loading edges missing. Given the previously used notation, we describe such models
as follows. Let $G = (V, E)$ be a DAG with vertex set of size $m = |V|$; these vertices
index the observed variables. Let $V' \subset V$ be the nodes representing observed
variables that do not directly depend on the latent variable. Then only the edges
$0 \to v$ with $v \in V \setminus V'$ are added when forming the extended DAG $\bar G$. The
parametrization of the Gaussian graphical model determined by $G$ and $V'$ is the
restriction of $\phi_G$ to the domain

$\Theta(V') := \{(\Lambda, \Omega, \delta) \in \Theta : \delta_v = 0 \text{ for all } v \in V'\}$.

When the parametrization maps $\phi_G$, $\tilde\varphi_G$ and $\varphi_G$ are restricted to the same domain,
the assertion of Lemma 3.1 still holds. The corresponding identifiability results,
which are in the spirit of Corollary 1 in Grzebyk et al. (2004), are stated below. A
brief outline of their proofs is given in Appendix A.


Theorem 6.1 (Sufficient condition). Let $G = (V, E)$ be a DAG, and let $V' \subset V$.
If every connected component of $(G^c)_{V \setminus V'}$, the subgraph of $G^c$ induced by $V \setminus V'$,
contains an odd cycle, then the parametrization map $\phi_G$ is generically finite-to-one
when restricted to the domain $\Theta(V')$.
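
The condition of Theorem 6.1 can be checked with a breadth-first 2-coloring, because a connected graph contains an odd cycle exactly when it is not bipartite. The sketch below assumes that $G^c$ denotes the undirected complement of the skeleton of $G$, that is, $v$ and $w$ are adjacent in $G^c$ precisely when neither $v \to w$ nor $w \to v$ is in $E$; this reading, the function names and the examples are our assumptions.

```python
from collections import deque

def complement_adjacency(nodes, edges, excluded=()):
    """Adjacency lists of the complement graph induced on nodes minus excluded."""
    excluded = set(excluded)
    keep = [v for v in nodes if v not in excluded]
    adjacent = {frozenset(e) for e in edges}  # skeleton of the DAG
    return {v: [w for w in keep if w != v and frozenset((v, w)) not in adjacent]
            for v in keep}

def every_component_has_odd_cycle(adj):
    """True if each connected component is non-bipartite (has an odd cycle)."""
    colour = {}
    for start in adj:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        odd_cycle_found = False
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in colour:
                    colour[w] = 1 - colour[v]
                    queue.append(w)
                elif colour[w] == colour[v]:
                    odd_cycle_found = True  # an edge inside one colour class
        if not odd_cycle_found:
            return False
    return True

nodes = [1, 2, 3, 4]
# Complement of 1 -> 2, 3 -> 4 is a 4-cycle, which is bipartite.
fails = every_component_has_odd_cycle(complement_adjacency(nodes, [(1, 2), (3, 4)]))
# Complement of 1 -> 2 -> 3 contains the triangle on {1, 3, 4}.
holds = every_component_has_odd_cycle(complement_adjacency(nodes, [(1, 2), (2, 3)]))
# With V' = {4}, node 2 becomes isolated in the induced complement graph.
restricted = every_component_has_odd_cycle(
    complement_adjacency(nodes, [(1, 2), (2, 3)], excluded=[4]))
```

An isolated vertex forms a component without an odd cycle, so removing nodes via $V'$ can make the condition fail even if it holds for the full graph.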


The necessary condition given next makes references to the graphs $G_{con}$ and
$G_{|L,cov}$ that were defined in the introduction.


Theorem 6.2 (Necessary condition). Let $G = (V, E)$ be a DAG, and let $V' \subset V$.
In order for the restriction of $\phi_G$ to the domain $\Theta(V')$ to be generically finite-to-one,
it is necessary that the following two conditions both hold:

(i) Let $\tilde G^c_{con} = (V \setminus V', E_{con})$ be the subgraph of $G^c_{con}$ induced by $V \setminus V'$. If
$d_{con}$ is the number of connected components in the graph $\tilde G^c_{con}$ that do not
contain any odd cycle, then $|E_{con}| - |E| \ge d_{con}$.

(ii) Let $\tilde G^c_{|L,cov} = (V \setminus V', E_{|L,cov})$ be the subgraph of $G^c_{|L,cov}$ induced by $V \setminus V'$.
If $d_{cov}$ is the number of connected components in the graph $\tilde G^c_{|L,cov}$ that do
not contain any odd cycle, then $|E_{|L,cov}| - |E| \ge d_{cov}$.


While Theorems 6.1 and 6.2 may be useful in some contexts, models in which
latent variables are parents to only some of the observables deserve a more in-depth
treatment in future work. In particular, it would be natural to seek ways to combine
the results of Stanghellini and Wermuth (2005) and the present paper with the work
of Foygel et al. (2012) and Drton and Weihs (2015).


Appendix A. Proofs 


Proof of Lemma 1.1. We may assume $d \ge n$, otherwise $J_f$ is never of full column
rank. The implication (i) ⇒ (ii) is obvious.

To show (ii) ⇒ (iii), suppose for contradiction that $J_f$ is not generically of full
rank. Since $f$ is polynomial, we then know that $\mathrm{Rank}(J_f) = r < n$ generically, that
is, outside a proper algebraic subset $S' \subset \mathbb{R}^n$ the rank is constant $r$. By the rank
theorem (Rudin, 1976, p. 229), for every point $s$ of $S$ that lies outside both $S'$ and the
exceptional set from (ii), we can choose an open ball $B(s)$ that contains $s$, avoids
both of these sets, and for which the restricted
map $f|_{B(s)}$ has fibers of dimension $n - r > 0$, contradicting (ii).
It remains to show (iii) ⇒ (i). We observe that since $f$ is a polynomial we can
assume $S = \mathbb{R}^n$. We then show that the set of points with an infinite fiber, denoted

$F_f := \{s \in \mathbb{R}^n : |\mathcal{F}_f(s)| = \infty\}$,

is contained in a proper algebraic subset of $\mathbb{R}^n$. We note that it suffices to assume
$n = d$, for without loss of generality, we can permute the $d$ component functions of
$f$ and assume that $\pi \circ f : \mathbb{R}^n \to \mathbb{R}^n$ has a generically full rank Jacobian matrix,
where $\pi$ is the projection onto the first $n$ coordinates. Then $F_f \subset F_{\pi \circ f}$.

Now, assume $d = n$, and let $C = \{s \in \mathbb{R}^n : \det J_f(s) = 0\}$ be the set of critical
points of $f$, where $J_f$ is the Jacobian matrix of $f$. Note that by assumption $C$ is a
proper algebraic subset of $\mathbb{R}^n$.


Claim. If $y \in \mathbb{R}^n$ is a point such that $|\mathcal{F}_f(y)| = \infty$, then $\mathcal{F}_f(y) \cap C \neq \emptyset$.

Proof of the Claim. If an algebraic set like $\mathcal{F}_f(y)$ is infinite, then it has dimension
$k > 0$. By semialgebraic stratification (Basu et al., 2006), one can see that there
exists an open set $U \subset \mathbb{R}^k$ and a differentiable map $g : U \to \mathcal{F}_f(y)$ such that the
Jacobian of $g$ has full rank on $U$. If $\mathcal{F}_f(y) \cap C = \emptyset$, then the chain rule yields that
the composition $f \circ g : U \to \{f(y)\}$ has Jacobian of positive rank. This, however, is a
contradiction because $f \circ g$ is a constant function. Hence, $\mathcal{F}_f(y) \cap C \neq \emptyset$. □


The claim implies that $F_f \subset f^{-1}(f(C)) \subset f^{-1}(\overline{f(C)})$, where $\overline{f(C)}$ is the Zariski
closure of the semialgebraic set $f(C)$. Since $\overline{f(C)}$ is algebraic, so is $f^{-1}(\overline{f(C)})$ given
that $f$ is a polynomial. To finish the proof we only need to show that $f^{-1}(\overline{f(C)})$ has
dimension less than $n$, which is equivalent to $f^{-1}(\overline{f(C)}) \neq \mathbb{R}^n$. By Sard's theorem
(Basu et al., 2006, p. 192), $f(C)$, and thus also $\overline{f(C)}$, has dimension less than $n$. If
$f^{-1}(\overline{f(C)}) = \mathbb{R}^n$, then the inverse function theorem, which says that the restricted
map $f|_{\mathbb{R}^n \setminus C}$ is a local diffeomorphism, is contradicted. □


Proof of Theorem \l.fA Let $m = |V|$. For $i = 1, 2$, let $\bar G_i = (\bar V, \bar E_i)$ be the extended
DAG of $G_i$, i.e., $\bar V = \{0, 1, \dots, m\}$, and $\bar E_i = E_i \cup \{0 \to v : v \in \{1, \dots, m\}\}$. By
the well-known characterization that two DAGs are Markov equivalent if and only
if they have the same skeleton and v-structures (Pearl, 2009), it is easy to see that
$\bar G_1$ and $\bar G_2$ are also Markov equivalent.

For $i \in \{1, 2\}$, let $\Theta_i := \mathbb{R}^{E_i} \times \mathrm{diag}_m^+ \times \mathbb{R}^m$. Define the map $\Phi_{\bar G_i}$ on $\Theta_i$ by

$\Phi_{\bar G_i}(\Lambda, \Omega, \delta) = (I_{m+1} - \bar\Lambda^T)^{-1}\,\bar\Omega\,(I_{m+1} - \bar\Lambda)^{-1}$,

where $\bar\Lambda$ is an $(m + 1) \times (m + 1)$ matrix such that

$\bar\Lambda_{vw} = \begin{cases} \delta_w & \text{if } v = 0,\ w = 1, \dots, m, \\ \Lambda_{vw} & \text{if } v, w = 1, \dots, m, \\ 0 & \text{otherwise,} \end{cases}$

and $\bar\Omega$ is a diagonal matrix with $\bar\Omega_{00} = 1$ and $\bar\Omega_{vv} = \Omega_{vv}$ for $v = 1, \dots, m$. Then
the image $\Phi_{\bar G_i}(\Theta_i)$ is the set of all covariance matrices of $(m + 1)$-variate Gaussian
distributions that obey the global Markov property of $\bar G_i$ and have the variance of
node 0, which represents the latent variable $L$, equal to 1. Consider the projection

$\pi : \Sigma \mapsto \Sigma_{[m]}$,

where $\Sigma$ has its rows and columns indexed by $\{0, \dots, m\}$. Then the parametrization
map for the latent variable model $\mathcal{N}_*(G_i)$ equals

(A.1) $\phi_{G_i} = \pi \circ \Phi_{\bar G_i}$.

Since $\bar G_1$ and $\bar G_2$ are Markov equivalent, $\Phi_{\bar G_1}(\Theta_1) = \Phi_{\bar G_2}(\Theta_2)$. By Lemma 2.1,
each map $\Phi_{\bar G_i}$ is injective on $\Theta_i$ with rational inverse defined on the common image
$\Phi_{\bar G_1}(\Theta_1) = \Phi_{\bar G_2}(\Theta_2)$. From (A.1), we obtain that

$\phi_{G_1} = \pi \circ \Phi_{\bar G_1} = \pi \circ \Phi_{\bar G_2} \circ \Phi_{\bar G_2}^{-1} \circ \Phi_{\bar G_1} = \phi_{G_2} \circ \big(\Phi_{\bar G_2}^{-1} \circ \Phi_{\bar G_1}\big)$.

Since $\Phi_{\bar G_2}^{-1} \circ \Phi_{\bar G_1} : \Theta_1 \to \Theta_2$ is a diffeomorphism, the chain rule implies that the
Jacobian of $\phi_{G_1}$ can be of full column rank if and only if the same is true for $\phi_{G_2}$.
Since the $\phi_{G_i}$ are polynomial, the two Jacobians either both have generically full rank or
are both everywhere rank deficient. By Lemma 1.1, $\phi_{G_1}$ is generically finite-to-one
if and only if $\phi_{G_2}$ is so. □


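Proof of Lemma 3.3. We first give the structure of $J(\varphi_G)$ block by block.

(a) “$[J(\varphi_G)]_{D,\{\Psi,\Lambda,\gamma\}}$”: For a given pair $(v, v) \in D$,

$[\varphi_G(\Lambda, \Psi, \gamma)]_{vv} = \psi_v + \Big(\sum_{w : v \to w \in E} \lambda_{vw}^2 \psi_w\Big) - \gamma_v^2$.

Hence,

(A.2) $[J(\varphi_G)]_{(v,v),\psi_w} = \begin{cases} 1 & \text{if } v = w, \\ \lambda_{vw}^2 & \text{if } v \to w \in E, \\ 0 & \text{otherwise,} \end{cases}$

(A.3) $[J(\varphi_G)]_{(v,v),\lambda_{wu}} = \begin{cases} 2\lambda_{wu}\psi_u & \text{if } v = w, \\ 0 & \text{otherwise,} \end{cases}$

and

(A.4) $[J(\varphi_G)]_{(v,v),\gamma_u} = \begin{cases} -2\gamma_u & \text{if } v = u, \\ 0 & \text{otherwise.} \end{cases}$

(b) “$[J(\varphi_G)]_{E,\{\Psi,\Lambda,\gamma\}}$”: For any $v \to w \in E$,

$[\varphi_G(\Lambda, \Psi, \gamma)]_{vw} = -\lambda_{vw}\psi_w + \Big(\sum_{u : v \to u \in E,\ w \to u \in E} \lambda_{vu}\lambda_{wu}\psi_u\Big) - \gamma_v\gamma_w$.

Hence,

(A.5) $[J(\varphi_G)]_{v \to w,\psi_u} = \begin{cases} -\lambda_{vw} & \text{if } u = w, \\ \lambda_{vu}\lambda_{wu} & \text{if } v \to u \in E \text{ and } w \to u \in E, \\ 0 & \text{otherwise,} \end{cases}$

(A.6) $[J(\varphi_G)]_{v \to w,\lambda_{ux}} = \begin{cases} -\psi_w & \text{if } v = u,\ w = x, \\ \lambda_{wx}\psi_x & \text{if } u = v,\ u \to x \in E \text{ and } w \to x \in E, \\ \lambda_{vx}\psi_x & \text{if } u = w,\ u \to x \in E \text{ and } v \to x \in E, \\ 0 & \text{otherwise,} \end{cases}$

and

(A.7) $[J(\varphi_G)]_{v \to w,\gamma_u} = \begin{cases} -\gamma_w & \text{if } v = u, \\ -\gamma_v & \text{if } w = u, \\ 0 & \text{otherwise.} \end{cases}$

(c) “$[J(\varphi_G)]_{N,\{\Psi,\Lambda,\gamma\}}$”: For any $v \not\to w \in N$,

(A.8) $[\varphi_G(\Lambda, \Psi, \gamma)]_{vw} = \Big(\sum_{u : v \to u \in E,\ w \to u \in E} \lambda_{vu}\lambda_{wu}\psi_u\Big) - \gamma_v\gamma_w$.

Hence,

(A.9) $[J(\varphi_G)]_{v \not\to w,\psi_u} = \begin{cases} \lambda_{vu}\lambda_{wu} & \text{if } v \to u \in E \text{ and } w \to u \in E, \\ 0 & \text{otherwise,} \end{cases}$

(A.10) $[J(\varphi_G)]_{v \not\to w,\lambda_{ux}} = \begin{cases} \lambda_{wx}\psi_x & \text{if } u = v,\ u \to x \in E \text{ and } w \to x \in E, \\ \lambda_{vx}\psi_x & \text{if } u = w,\ u \to x \in E \text{ and } v \to x \in E, \\ 0 & \text{otherwise,} \end{cases}$

and

(A.11) $[J(\varphi_G)]_{v \not\to w,\gamma_u} = \begin{cases} -\gamma_w & \text{if } v = u, \\ -\gamma_v & \text{if } w = u, \\ 0 & \text{otherwise.} \end{cases}$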

With slight abuse of notation, let $|\Psi|$, $|\gamma|$, $|\Lambda|$ denote the number of free variables
in $\Psi$, $\gamma$ and $\Lambda$ respectively. Considering that $|D| = |\Psi|$ and $|E| = |\Lambda|$, we must
have that $|N| \ge |\gamma|$ since $J(\varphi_G)$ is a tall matrix. Hence, if $[J(\varphi_G)]_{N,\gamma}$ is generically
of full column rank, then there exists a subset $N' \subseteq N$ such that $|N'| = |\gamma|$ and
the determinant of $[J(\varphi_G)]_{N',\gamma}$ is a nonzero polynomial in the variables of $\gamma$, in
consideration of (A.11). Now it suffices to show that the $(2m + |E|) \times (2m + |E|)$
square submatrix $[J(\varphi_G)]_{\{D,E,N'\},\{\Psi,\Lambda,\gamma\}}$ is generically of full rank.

Since the concerned matrix has polynomial entries, we need to show that the
determinant of $[J(\varphi_G)]_{\{D,E,N'\},\{\Psi,\Lambda,\gamma\}}$ is a nonzero polynomial. To this end, it
is sufficient to show that the determinant is a nonzero polynomial in the entries
of $(\Lambda, \gamma)$ when we specialize $\psi_1 = \dots = \psi_m = 1$. Noting that $|\Psi| + |\Lambda| + |\gamma| =
|D| + |E| + |N'|$, let $P$ denote the set of all bijections mapping from
the set $D \cup E \cup N'$ to the set of free variables in $\Lambda$, $\Psi$ and $\gamma$. Choose any ordering
of the elements of domain and codomain so as to have a well-defined sign for the
permutations. Then by Leibniz's formula, we have

$\det\big([J(\varphi_G)]_{\{D,E,N'\},\{\Psi,\Lambda,\gamma\}}\big) = \sum_{\sigma \in P} \mathrm{sgn}(\sigma) \prod_{s \in D \cup E \cup N'} J(\varphi_G)_{s,\sigma(s)}$.
Let $P'$ be the subset of all permutations $\sigma \in P$ with $\sigma((v, v)) = \psi_v$ for all $(v, v) \in D$
and $\sigma((v, w)) = \lambda_{vw}$ for all $(v, w) \in E$. Then we obtain that

$\det\big([J(\varphi_G)]_{\{D,E,N'\},\{\Psi,\Lambda,\gamma\}}\big)$
$\quad = \sum_{\sigma \in P'} \mathrm{sgn}(\sigma) \prod_{s \in D \cup E \cup N'} J(\varphi_G)_{s,\sigma(s)} + \sum_{\sigma \in P \setminus P'} \mathrm{sgn}(\sigma) \prod_{s \in D \cup E \cup N'} J(\varphi_G)_{s,\sigma(s)}$

(A.12) $\quad = \pm \det\big(J(\varphi_G)_{N',\gamma}\big) + \sum_{\sigma \in P \setminus P'} \mathrm{sgn}(\sigma) \prod_{s \in D \cup E \cup N'} J(\varphi_G)_{s,\sigma(s)}$,

where the equality in (A.12) follows from (A.2), (A.6) and the fact that $\psi_1 = \dots =
\psi_m = 1$. We also deduce from (A.2)-(A.11) that every summand in the second
term of (A.12) is either zero or a polynomial term involving free variables of $\Lambda$. In
contrast, $\det(J(\varphi_G)_{N',\gamma})$ is a nonzero polynomial only in free variables of $\gamma$ and
can thus not be canceled by the second term in (A.12). □

Proof of Lemma 5.4. We first prove (i). Since $\mathcal{N}_*(G)$ is generically finitely
identifiable by Theorem 1.3, there exists an algebraic subset $\Xi'$ such that for all $\theta \in \Theta \setminus \Xi'$,
$|\mathcal{F}_{\phi_G}(\theta)| < \infty$. Define $\Xi$ to be the union of $\Xi'$ and the set of triples $(\Lambda, \Omega, \delta) \in
\mathbb{R}^{2m+|E|}$ with at least one coordinate $\delta_i = 0$. Let $\Sigma_0 = \phi_G(\Lambda_0, \Omega_0, \delta_0)$ and

(A.13) $S = (s_{ij}) := (I_m - \Lambda^T)\,\Sigma_0\,(I_m - \Lambda)$.

Then for $1 \le i < j \le m$,

$s_{ij} = \sum_{1 \le k, k' \le m} \lambda_{ki}[\Sigma_0]_{kk'}\lambda_{k'j} - \sum_{1 \le k \le m} \lambda_{ki}[\Sigma_0]_{kj} - \sum_{1 \le k \le m} [\Sigma_0]_{ik}\lambda_{kj} + [\Sigma_0]_{ij}$
$\quad = \begin{cases} -\sum_{(k,m) \in E} [\Sigma_0]_{ik}\lambda_{km} + [\Sigma_0]_{im} & \text{if } j = m, \\ [\Sigma_0]_{ij} & \text{if } j < m, \end{cases}$

where the last equality follows from the fact that the $\lambda_{ij}$ are nonzero only when $(i, j) \in
E$. Hence, for any four indices $1 \le i < j < k < l \le m$, the tetrads

$s_{ij}s_{kl} - s_{ik}s_{jl}, \quad s_{il}s_{jk} - s_{ik}s_{jl}$

are constant polynomials when $l < m$ and have degree 1 in the variables $\{\lambda_{vm} :
(v, m) \in E\}$ when $l = m$. The equation system

TETRADS($S$) $= 0$

is thus a consistent linear system that can be represented as

(A.14) $C\,\Lambda_{\mathrm{pa}(m),m} = c$,

where $\Lambda_{\mathrm{pa}(m),m} = (\lambda_{vm})_{v \in \mathrm{pa}(m)}$ is the vector of all free $\Lambda$ variables, $C$ is a $2\binom{m-1}{3} \times
|\mathrm{pa}(m)|$ matrix and $c$ is a $2\binom{m-1}{3}$-vector. Both $C$ and $c$ depend only on $\Sigma_0$.

To finish the proof, we now need to show that (A.14) is uniquely solvable in
$\Lambda_{\mathrm{pa}(m),m}$. We will aim to contradict $|\mathcal{F}_{\phi_G}(\theta_0)| < \infty$ if (A.14) does not have a
unique solution. Note that the solution set is an affine subspace $\mathcal{L} \subset \mathbb{R}^{|E|}$. For a
contradiction, suppose that $\mathcal{L}$ is of positive dimension. Upon substituting $\Lambda = \Lambda_0$
into (A.13), we obtain

$S_0 = (s^0_{ij}) = (I_m - \Lambda_0^T)\,\Sigma_0\,(I_m - \Lambda_0)$,
and in consideration of (5.3) in Theorem 5.2 it must be true that

$s^0_{ii}s^0_{jk} - s^0_{ik}s^0_{ji} > 0$, for all $i \neq j \neq k$.

We may then pick an open ball $B(\Lambda_0)$ such that for all solutions $\Lambda \in \mathcal{L} \cap B(\Lambda_0)$,
the matrix $S = (s_{ij})$ defined by (A.13) satisfies

$s_{ii}s_{jk} - s_{ik}s_{ji} > 0$, for all $i \neq j \neq k$.

It follows that $\mathcal{L} \cap B(\Lambda_0)$ is an infinite set whose elements $\Lambda$ all make the matrix
$(I_m - \Lambda^T)\,\Sigma_0\,(I_m - \Lambda)$ a Spearman matrix. Hence, the system

$(I_m - \Lambda^T)\,\Sigma_0\,(I_m - \Lambda) = \Omega + \delta\delta^T$

has infinitely many solutions, contradicting $|\mathcal{F}_{\phi_G}(\theta_0)| < \infty$.

The proof of (ii) is analogous. We first let $\Upsilon_0 = \tilde\varphi_G(\Lambda_0, \Psi_0, \gamma_0)$ and define

(A.15) $S = (s_{ij}) = (I_m - \Lambda)^{-1}\,\Upsilon_0\,(I_m - \Lambda^T)^{-1}$.

Noting that in this case $(I_m - \Lambda)^{-1} = I_m + \Lambda$, it can be easily seen that

TETRADS($S$) $=$ TETRADS$\big((I_m - \Lambda)^{-1}\,\Upsilon_0\,(I_m - \Lambda^T)^{-1}\big) = 0$

is a linear system in the variables $\{\lambda_{1v} : v \in \mathrm{ch}(1)\}$. Similar to the above arguments,
we may use Theorem 1.3 and Theorem 5.3 to prove by contradiction that the system
can only have a unique solution in $\{\lambda_{1v} : v \in \mathrm{ch}(1)\}$. □
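
The identity $(I_m - \Lambda)^{-1} = I_m + \Lambda$ used in part (ii) can be verified directly. The short computation below is a worked check under the hypothesis of Lemma 5.4(ii), writing $\lambda_{vw}$ for the entries of $\Lambda$.

```latex
% Assumption: E \subset \{(1,k) : k \ge 2\}, so \lambda_{vw} = 0 unless v = 1,
% and in particular \lambda_{v1} = 0 for every v.
(\Lambda^2)_{vw} = \sum_{u} \lambda_{vu}\lambda_{uw} = \lambda_{v1}\lambda_{1w} = 0
\quad\Longrightarrow\quad
(I_m - \Lambda)(I_m + \Lambda) = I_m - \Lambda^2 = I_m .
```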

Proof of Theorems 6.1 and 6.2. For Theorem 6.1, one can partition the Jacobian
matrix $J(\varphi_G)$ of $\varphi_G$ as in (3.7), only with $\gamma$ replaced by $\gamma_{V \setminus V'} = \{\gamma_v : v \in
V \setminus V'\}$. In analogy with Lemma 3.3, it can be shown that $J(\varphi_G)$ is of full column
rank if $[J(\varphi_G)]_{N,\gamma_{V \setminus V'}}$ is. The reasoning is then analogous to that in the proof of
Theorem 1.3, the main step being the application of Lemma 3.2, where the graph
defining the considered map becomes $(G^c)_{V \setminus V'}$.

The proof of Theorem 6.2 is analogous to the proof of Theorem 1.4. The only
change is to replace $G^c_{con}$, $G^c_{|L,cov}$, $\gamma$ and $\delta$ by $\tilde G^c_{con}$, $\tilde G^c_{|L,cov}$, $\gamma_{V \setminus V'}$ and $\delta_{V \setminus V'}$,
respectively. □